Less anger inducing pull request for 4.11

-----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYr5aeAAoJEAx081l5xIa+ZK4P/RD3XUsduYqziVFCRQ2n0X8r +D92F4peTnSeSq7ZcZvprv+fezUGAHbfsWFs8feYCI5quUO6pEQSPwN+wyGazUi0 4hUVB/K9Iq7U/Bj7Z/SmsU3NuWJnkNqbmvSFvUdqYK9D/kl+Tnllzap2N4cTzjwu GZOObz4n85cx94NqC3qw+7/ptL1X2MhXa+z0MzbkKyas84Bko1LwCSHRHsDKUnJc IcSpOcYZ6pSRMIsKH4Kd79Go4vWm7djXT9XL3PwDk2NcXXUOuR+cfdHqYchYaM/O iD2hvaSywBcflxSAml5x6vlXraoRd91ZZulgOObXtFfnUXdZB81TVq4uv6LU4Bx3 jLFixUZuk/TJT+W/8N10l7M6yMIFaTpNoNMc5n4IF5RNNyWba4BKnrI+f+lQiOpY mmjIaidb0t5BICnJzCD264RhCEXmP0HaDV+iQQV6y6jJRXfd1bgnOXLKP73JekzB TsbDshCoE7UO0dJ7n0LFpXSTQDTYzlazoEp14f2kFBxir5/l7r67nUlnDTvUQfuN tSRvpN/s0wqvH3o7zhmpHxyJ/ZasPMQjNCFAuUEbx8L5SKXsua0FubIzN4aVpilb XvfdFRWM/lkOT/q+8cGI/TcE3YTqEmALmGxdV/akbdNCiCg6aClyCLRE/DZhgmSQ UMFjr9wlHl5Qo/OqLKj0 =Yjfg -----END PGP SIGNATURE----- Merge tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "This is the main drm pull request for v4.11. Nothing too major, the tinydrm and mmu-less support should make writing smaller drivers easier for some of the simpler platforms, and there are a bunch of documentation updates. Intel grew displayport MST audio support which is hopefully useful to people, and FBC is on by default for GEN9+ (so people know where to look for regressions). AMDGPU has a lot of fixes that would like new firmware files installed for some GPUs. Other than that it's pretty scattered all over. I may have a follow up pull request as I know BenH has a bunch of AST rework and fixes and I'd like to get those in once they've been tested by AST, and I've got at least one pull request I'm just trying to get the author to fix up. Core: - drm_mm reworked - Connector list locking and iterators - Documentation updates - Format handling rework - MMU-less support for fbdev helpers - drm_crtc_from_index helper - Core CRC API - Remove drm_framebuffer_unregister_private - Debugfs cleanup - EDID/Infoframe fixes - Release callback - Tinydrm support (smaller drivers for simple hw) panel: - Add support for some new simple panels i915: - FBC by default for gen9+ - Shared dpll cleanups and docs - GEN8 powerdomain cleanup - DMC support on GLK - DP MST audio support - HuC loading support - GVT init ordering fixes - GVT IOMMU workaround fix amdgpu/radeon: - Power/clockgating improvements - Preliminary SR-IOV support - TTM buffer priority and eviction fixes - SI DPM quirks removed due to firmware fixes - Powerplay improvements - VCE/UVD powergating fixes - Cleanup SI GFX code to match CI/VI - Support for > 2 displays on 3/5 crtc asics - SI headless fixes nouveau: - Rework securre boot code in prep for GP10x secure boot - Channel recovery improvements - Initial power budget code - MMU rework preperation vmwgfx: - Bunch of fixes and cleanups exynos: - Runtime PM support for MIC driver - Cleanups to use atomic helpers - UHD Support for TM2/TM2E boards - Trigger mode fix for Rinato board etnaviv: - Shader performance fix - Command stream validator fixes - Command buffer suballocator rockchip: - CDN DisplayPort support - IOMMU support for arm64 platform imx-drm: - Fix i.MX5 TV encoder probing - Remove lower fb size limits msm: - Support for HW cursor on MDP5 devices - DSI encoder cleanup - GPU DT bindings cleanup sti: - stih410 cleanups - Create fbdev at binding - HQVDP fixes - Remove stih416 chip functionality - DVI/HDMI mode selection fixes - FPS statistic reporting omapdrm: - IRQ code cleanup dwi-hdmi bridge: - Cleanups and fixes adv-bridge: - Updates for nexus sii8520 bridge: - Add interlace mode support - Rework HDMI and lots of fixes qxl: - probing/teardown cleanups ZTE drm: - HDMI audio via SPDIF interface - Video Layer overlay plane support - Add TV encoder output device atmel-hlcdc: - Rework fbdev creation logic tegra: - OF node fix fsl-dcu: - Minor fixes mali-dp: - Assorted fixes sunxi: - Minor fix" [ This was the "fixed" pull, that still had build warnings due to people not even having build tested the result. I'm not a happy camper I've fixed the things I noticed up in this merge. - Linus ] * tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux: (1177 commits) lib/Kconfig: make PRIME_NUMBERS not user selectable drm/tinydrm: helpers: Properly fix backlight dependency drm/tinydrm: mipi-dbi: Fix field width specifier warning drm/tinydrm: mipi-dbi: Silence: ‘cmd’ may be used uninitialized drm/sti: fix build warnings in sti_drv.c and sti_vtg.c files drm/amd/powerplay: fix PSI feature on Polars12 drm/amdgpu: refuse to reserve io mem for split VRAM buffers drm/ttm: fix use-after-free races in vm fault handling drm/tinydrm: Add support for Multi-Inno MI0283QT display dt-bindings: Add Multi-Inno MI0283QT binding dt-bindings: display/panel: Add common rotation property of: Add vendor prefix for Multi-Inno drm/tinydrm: Add MIPI DBI support drm/tinydrm: Add helper functions drm: Add DRM support for tiny LCD displays drm/amd/amdgpu: post card if there is real hw resetting performed drm/nouveau/tmr: provide backtrace when a timeout is hit drm/nouveau/pci/g92: Fix rearm drm/nouveau/drm/therm/fan: add a fallback if no fan control is specified in the vbios drm/nouveau/hwmon: expose power_max and power_crit ..
2017-02-23 18:58:18 -08:00 · 2017-02-23 18:58:18 -08:00 · ef96152e6a
parent d5500a0747 64a577196d
commit ef96152e6a
923 changed files with 46428 additions and 22570 deletions
--- a/Documentation/devicetree/bindings/display/brcm,bcm-vc4.txt
+++ b/Documentation/devicetree/bindings/display/brcm,bcm-vc4.txt
@ -56,6 +56,18 @@ Required properties for V3D:
 - interrupts:	The interrupt number
 		  See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt

+Required properties for DSI:
+- compatible:	Should be "brcm,bcm2835-dsi0" or "brcm,bcm2835-dsi1"
+- reg:		Physical base address and length of the DSI block's registers
+- interrupts:	The interrupt number
+		  See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt
+- clocks:	a) phy: The DSI PLL clock feeding the DSI analog PHY
+		b) escape: The DSI ESC clock from CPRMAN
+		c) pixel: The DSI pixel clock from CPRMAN
+- clock-output-names:
+		The 3 clocks output from the DSI analog PHY: dsi[01]_byte,
+		dsi[01]_ddr2, and dsi[01]_ddr
+
 [1] Documentation/devicetree/bindings/media/video-interfaces.txt

 Example:
@ -99,6 +111,29 @@ dpi: dpi@7e208000 {
 	};
 };

+dsi1: dsi@7e700000 {
+	compatible = "brcm,bcm2835-dsi1";
+	reg = <0x7e700000 0x8c>;
+	interrupts = <2 12>;
+	#address-cells = <1>;
+	#size-cells = <0>;
+	#clock-cells = <1>;
+
+	clocks = <&clocks BCM2835_PLLD_DSI1>,
+		 <&clocks BCM2835_CLOCK_DSI1E>,
+		 <&clocks BCM2835_CLOCK_DSI1P>;
+	clock-names = "phy", "escape", "pixel";
+
+	clock-output-names = "dsi1_byte", "dsi1_ddr2", "dsi1_ddr";
+
+	pitouchscreen: panel@0 {
+		compatible = "raspberrypi,touchscreen";
+		reg = <0>;
+
+		<...>
+	};
+};
+
 vec: vec@7e806000 {
 	compatible = "brcm,bcm2835-vec";
 	reg = <0x7e806000 0x1000>;
--- a/Documentation/devicetree/bindings/display/bridge/adi,adv7511.txt
+++ b/Documentation/devicetree/bindings/display/bridge/adi,adv7511.txt
@ -38,10 +38,22 @@ The following input format properties are required except in "rgb 1x" and
 - adi,input-justification: The input bit justification ("left", "evenly",
  "right").

+- avdd-supply: A 1.8V supply that powers up the AVDD pin on the chip.
+- dvdd-supply: A 1.8V supply that powers up the DVDD pin on the chip.
+- pvdd-supply: A 1.8V supply that powers up the PVDD pin on the chip.
+- dvdd-3v-supply: A 3.3V supply that powers up the pin called DVDD_3V
+  on the chip.
+- bgvdd-supply: A 1.8V supply that powers up the BGVDD pin. This is
+  needed only for ADV7511.
+
 The following properties are required for ADV7533:

 - adi,dsi-lanes: Number of DSI data lanes connected to the DSI host. It should
  be one of 1, 2, 3 or 4.
+- a2vdd-supply: 1.8V supply that powers up the A2VDD pin on the chip.
+- v3p3-supply: A 3.3V supply that powers up the V3P3 pin on the chip.
+- v1p2-supply: A supply that powers up the V1P2 pin on the chip. It can be
+  either 1.2V or 1.8V.

 Optional properties:

--- a/Documentation/devicetree/bindings/display/bridge/dw_hdmi.txt
+++ b/Documentation/devicetree/bindings/display/bridge/dw_hdmi.txt
@ -1,52 +1,33 @@
-DesignWare HDMI bridge bindings
+Synopsys DesignWare HDMI TX Encoder
+===================================

-Required properties:
- compatible: platform specific such as:
-   * "snps,dw-hdmi-tx"
-   * "fsl,imx6q-hdmi"
-   * "fsl,imx6dl-hdmi"
-   * "rockchip,rk3288-dw-hdmi"
- reg: Physical base address and length of the controller's registers.
- interrupts: The HDMI interrupt number
- clocks, clock-names : must have the phandles to the HDMI iahb and isfr clocks,
-  as described in Documentation/devicetree/bindings/clock/clock-bindings.txt,
-  the clocks are soc specific, the clock-names should be "iahb", "isfr"
-port@[X]: SoC specific port nodes with endpoint definitions as defined
-   in Documentation/devicetree/bindings/media/video-interfaces.txt,
-   please refer to the SoC specific binding document:
-    * Documentation/devicetree/bindings/display/imx/hdmi.txt
-    * Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
+This document defines device tree properties for the Synopsys DesignWare HDMI
+TX Encoder (DWC HDMI TX). It doesn't constitue a device tree binding
+specification by itself but is meant to be referenced by platform-specific
+device tree bindings.

-Optional properties
- reg-io-width: the width of the reg:1,4, default set to 1 if not present
- ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing,
-  if the property is omitted, a functionally reduced I2C bus
-  controller on DW HDMI is probed
- clocks, clock-names: phandle to the HDMI CEC clock, name should be "cec"
+When referenced from platform device tree bindings the properties defined in
+this document are defined as follows. The platform device tree bindings are
+responsible for defining whether each property is required or optional.

-Example:
-	hdmi: hdmi@0120000 {
-		compatible = "fsl,imx6q-hdmi";
-		reg = <0x00120000 0x9000>;
-		interrupts = <0 115 0x04>;
-		gpr = <&gpr>;
-		clocks = <&clks 123>, <&clks 124>;
-		clock-names = "iahb", "isfr";
-		ddc-i2c-bus = <&i2c2>;
+- reg: Memory mapped base address and length of the DWC HDMI TX registers.

-		port@0 {
-			reg = <0>;
+- reg-io-width: Width of the registers specified by the reg property. The
+  value is expressed in bytes and must be equal to 1 or 4 if specified. The
+  register width defaults to 1 if the property is not present.

-			hdmi_mux_0: endpoint {
-				remote-endpoint = <&ipu1_di0_hdmi>;
-			};
-		};
+- interrupts: Reference to the DWC HDMI TX interrupt.

-		port@1 {
-			reg = <1>;
+- clocks: References to all the clocks specified in the clock-names property
+  as specified in Documentation/devicetree/bindings/clock/clock-bindings.txt.

-			hdmi_mux_1: endpoint {
-				remote-endpoint = <&ipu1_di1_hdmi>;
-			};
-		};
-	};
+- clock-names: The DWC HDMI TX uses the following clocks.
+
+  - "iahb" is the bus clock for either AHB and APB (mandatory).
+  - "isfr" is the internal register configuration clock (mandatory).
+  - "cec" is the HDMI CEC controller main clock (optional).
+
+- ports: The connectivity of the DWC HDMI TX with the rest of the system is
+  expressed in using ports as specified in the device graph bindings defined
+  in Documentation/devicetree/bindings/graph.txt. The numbering of the ports
+  is platform-specific.
--- a/Documentation/devicetree/bindings/display/bridge/ti,ths8135.txt
+++ b/Documentation/devicetree/bindings/display/bridge/ti,ths8135.txt
@ -0,0 +1,46 @@
+THS8135 Video DAC
+-----------------
+
+This is the binding for Texas Instruments THS8135 Video DAC bridge.
+
+Required properties:
+
+- compatible: Must be "ti,ths8135"
+
+Required nodes:
+
+This device has two video ports. Their connections are modelled using the OF
+graph bindings specified in Documentation/devicetree/bindings/graph.txt.
+
+- Video port 0 for RGB input
+- Video port 1 for VGA output
+
+Example
+-------
+
+vga-bridge {
+	compatible = "ti,ths8135";
+	#address-cells = <1>;
+	#size-cells = <0>;
+
+	ports {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		port@0 {
+			reg = <0>;
+
+			vga_bridge_in: endpoint {
+				remote-endpoint = <&lcdc_out_vga>;
+			};
+		};
+
+		port@1 {
+			reg = <1>;
+
+			vga_bridge_out: endpoint {
+				remote-endpoint = <&vga_con_in>;
+			};
+		};
+	};
+};
--- a/Documentation/devicetree/bindings/display/hisilicon/hisi-ade.txt
+++ b/Documentation/devicetree/bindings/display/hisilicon/hisi-ade.txt
@ -16,7 +16,7 @@ Required properties:
  "clk_ade_core" for the ADE core clock.
  "clk_codec_jpeg" for the media NOC QoS clock, which use the same clock with
  jpeg codec.
-  "clk_ade_pix" for the ADE pixel clok.
+  "clk_ade_pix" for the ADE pixel clock.
 - assigned-clocks: Should contain "clk_ade_core" and "clk_codec_jpeg" clocks'
  phandle + clock-specifier pairs.
 - assigned-clock-rates: clock rates, one for each entry in assigned-clocks.
--- a/Documentation/devicetree/bindings/display/imx/hdmi.txt
+++ b/Documentation/devicetree/bindings/display/imx/hdmi.txt
@ -1,29 +1,36 @@
-Device-Tree bindings for HDMI Transmitter
+Freescale i.MX6 DWC HDMI TX Encoder
+===================================

-HDMI Transmitter
-================
+The HDMI transmitter is a Synopsys DesignWare HDMI 1.4 TX controller IP
+with a companion PHY IP.
+
+These DT bindings follow the Synopsys DWC HDMI TX bindings defined in
+Documentation/devicetree/bindings/display/bridge/dw_hdmi.txt with the
+following device-specific properties.

-The HDMI Transmitter is a Synopsys DesignWare HDMI 1.4 TX controller IP
-with accompanying PHY IP.

 Required properties:
- - #address-cells : should be <1>
- - #size-cells : should be <0>
- - compatible : should be "fsl,imx6q-hdmi" or "fsl,imx6dl-hdmi".
- - gpr : should be <&gpr>.
-   The phandle points to the iomuxc-gpr region containing the HDMI
-   multiplexer control register.
- - clocks, clock-names : phandles to the HDMI iahb and isrf clocks, as described
-   in Documentation/devicetree/bindings/clock/clock-bindings.txt and
-   Documentation/devicetree/bindings/clock/imx6q-clock.txt.
- - port@[0-4]: Up to four port nodes with endpoint definitions as defined in
-   Documentation/devicetree/bindings/media/video-interfaces.txt,
-   corresponding to the four inputs to the HDMI multiplexer.

-Optional properties:
- - ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing
+- compatible : Shall be one of "fsl,imx6q-hdmi" or "fsl,imx6dl-hdmi".
+- reg: See dw_hdmi.txt.
+- interrupts: HDMI interrupt number
+- clocks: See dw_hdmi.txt.
+- clock-names: Shall contain "iahb" and "isfr" as defined in dw_hdmi.txt.
+- ports: See dw_hdmi.txt. The DWC HDMI shall have between one and four ports,
+  numbered 0 to 3, corresponding to the four inputs of the HDMI multiplexer.
+  Each port shall have a single endpoint.
+- gpr : Shall contain a phandle to the iomuxc-gpr region containing the HDMI
+  multiplexer control register.

-example:
+Optional properties
+
+- ddc-i2c-bus: The HDMI DDC bus can be connected to either a system I2C master
+  or the functionally-reduced I2C master contained in the DWC HDMI. When
+  connected to a system I2C master this property contains a phandle to that
+  I2C master controller.
+
+
+Example:

 	gpr: iomuxc-gpr@020e0000 {
 		/* ... */
--- a/Documentation/devicetree/bindings/display/msm/gpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/gpu.txt
@ -1,23 +1,19 @@
 Qualcomm adreno/snapdragon GPU

 Required properties:
- compatible: "qcom,adreno-3xx"
+- compatible: "qcom,adreno-XYZ.W", "qcom,adreno"
+    for example: "qcom,adreno-306.0", "qcom,adreno"
+  Note that you need to list the less specific "qcom,adreno" (since this
+  is what the device is matched on), in addition to the more specific
+  with the chip-id.
 - reg: Physical base address and length of the controller's registers.
 - interrupts: The interrupt signal from the gpu.
 - clocks: device clocks
  See ../clocks/clock-bindings.txt for details.
 - clock-names: the following clocks are required:
-  * "core_clk"
-  * "iface_clk"
-  * "mem_iface_clk"
- qcom,chipid: gpu chip-id.  Note this may become optional for future
-  devices if we can reliably read the chipid from hw
- qcom,gpu-pwrlevels: list of operating points
-  - compatible: "qcom,gpu-pwrlevels"
-  - for each qcom,gpu-pwrlevel:
-    - qcom,gpu-freq: requested gpu clock speed
-    - NOTE: downstream android driver defines additional parameters to
-      configure memory bandwidth scaling per OPP.
+  * "core"
+  * "iface"
+  * "mem_iface"

 Example:

@ -25,28 +21,18 @@ Example:
 	...

 	gpu: qcom,kgsl-3d0@4300000 {
-		compatible = "qcom,adreno-3xx";
+		compatible = "qcom,adreno-320.2", "qcom,adreno";
 		reg = <0x04300000 0x20000>;
 		reg-names = "kgsl_3d0_reg_memory";
 		interrupts = <GIC_SPI 80 0>;
 		interrupt-names = "kgsl_3d0_irq";
 		clock-names =
-		    "core_clk",
-		    "iface_clk",
-		    "mem_iface_clk";
+		    "core",
+		    "iface",
+		    "mem_iface";
 		clocks =
 		    <&mmcc GFX3D_CLK>,
 		    <&mmcc GFX3D_AHB_CLK>,
 		    <&mmcc MMSS_IMEM_AHB_CLK>;
-		qcom,chipid = <0x03020100>;
-		qcom,gpu-pwrlevels {
-			compatible = "qcom,gpu-pwrlevels";
-			qcom,gpu-pwrlevel@0 {
-				qcom,gpu-freq = <450000000>;
-			};
-			qcom,gpu-pwrlevel@1 {
-				qcom,gpu-freq = <27000000>;
-			};
-		};
 	};
 };
--- a/Documentation/devicetree/bindings/display/multi-inno,mi0283qt.txt
+++ b/Documentation/devicetree/bindings/display/multi-inno,mi0283qt.txt
@ -0,0 +1,27 @@
+Multi-Inno MI0283QT display panel
+
+Required properties:
+- compatible:	"multi-inno,mi0283qt".
+
+The node for this driver must be a child node of a SPI controller, hence
+all mandatory properties described in ../spi/spi-bus.txt must be specified.
+
+Optional properties:
+- dc-gpios:	D/C pin. The presence/absence of this GPIO determines
+		the panel interface mode (IM[3:0] pins):
+		- present: IM=x110 4-wire 8-bit data serial interface
+		- absent:  IM=x101 3-wire 9-bit data serial interface
+- reset-gpios:	Reset pin
+- power-supply:	A regulator node for the supply voltage.
+- backlight:	phandle of the backlight device attached to the panel
+- rotation:	panel rotation in degrees counter clockwise (0,90,180,270)
+
+Example:
+	mi0283qt@0{
+		compatible = "multi-inno,mi0283qt";
+		reg = <0>;
+		spi-max-frequency = <32000000>;
+		rotation = <90>;
+		dc-gpios = <&gpio 25 0>;
+		backlight = <&backlight>;
+	};
--- a/Documentation/devicetree/bindings/display/panel/boe,nv101wxmn51.txt
+++ b/Documentation/devicetree/bindings/display/panel/boe,nv101wxmn51.txt
@ -0,0 +1,7 @@
+BOE OPTOELECTRONICS TECHNOLOGY 10.1" WXGA TFT LCD panel
+
+Required properties:
+- compatible: should be "boe,nv101wxmn51"
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
--- a/Documentation/devicetree/bindings/display/panel/netron-dy,e231732.txt
+++ b/Documentation/devicetree/bindings/display/panel/netron-dy,e231732.txt
@ -0,0 +1,7 @@
+Netron-DY E231732 7.0" WSVGA TFT LCD panel
+
+Required properties:
+- compatible: should be "netron-dy,e231732"
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
--- a/Documentation/devicetree/bindings/display/panel/panel.txt
+++ b/Documentation/devicetree/bindings/display/panel/panel.txt
@ -0,0 +1,4 @@
+Common display properties
+-------------------------
+
+- rotation:	Display rotation in degrees counter clockwise (0,90,180,270)
--- a/Documentation/devicetree/bindings/display/panel/tianma,tm070jdhg30.txt
+++ b/Documentation/devicetree/bindings/display/panel/tianma,tm070jdhg30.txt
@ -0,0 +1,7 @@
+Tianma Micro-electronics TM070JDHG30 7.0" WXGA TFT LCD panel
+
+Required properties:
+- compatible: should be "tianma,tm070jdhg30"
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
--- a/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
+++ b/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
@ -1,24 +1,39 @@
-Rockchip specific extensions to the Synopsys Designware HDMI
-================================
+Rockchip DWC HDMI TX Encoder
+============================
+
+The HDMI transmitter is a Synopsys DesignWare HDMI 1.4 TX controller IP
+with a companion PHY IP.
+
+These DT bindings follow the Synopsys DWC HDMI TX bindings defined in
+Documentation/devicetree/bindings/display/bridge/dw_hdmi.txt with the
+following device-specific properties.
+

 Required properties:
- compatible: "rockchip,rk3288-dw-hdmi";
- reg: Physical base address and length of the controller's registers.
- clocks: phandle to hdmi iahb and isfr clocks.
- clock-names: should be "iahb" "isfr"
- rockchip,grf: this soc should set GRF regs to mux vopl/vopb.
+
+- compatible: Shall contain "rockchip,rk3288-dw-hdmi".
+- reg: See dw_hdmi.txt.
+- reg-io-width: See dw_hdmi.txt. Shall be 4.
 - interrupts: HDMI interrupt number
- ports: contain a port node with endpoint definitions as defined in
-  Documentation/devicetree/bindings/media/video-interfaces.txt. For
-  vopb,set the reg = <0> and set the reg = <1> for vopl.
- reg-io-width: the width of the reg:1,4, the value should be 4 on
-  rk3288 platform
+- clocks: See dw_hdmi.txt.
+- clock-names: Shall contain "iahb" and "isfr" as defined in dw_hdmi.txt.
+- ports: See dw_hdmi.txt. The DWC HDMI shall have a single port numbered 0
+  corresponding to the video input of the controller. The port shall have two
+  endpoints, numbered 0 and 1, connected respectively to the vopb and vopl.
+- rockchip,grf: Shall reference the GRF to mux vopl/vopb.

 Optional properties
- ddc-i2c-bus: phandle of an I2C controller used for DDC EDID probing
- clocks, clock-names: phandle to the HDMI CEC clock, name should be "cec"
+
+- ddc-i2c-bus: The HDMI DDC bus can be connected to either a system I2C master
+  or the functionally-reduced I2C master contained in the DWC HDMI. When
+  connected to a system I2C master this property contains a phandle to that
+  I2C master controller.
+- clock-names: See dw_hdmi.txt. The "cec" clock is optional.
+- clock-names: May contain "cec" as defined in dw_hdmi.txt.
+

 Example:
+
 hdmi: hdmi@ff980000 {
 	compatible = "rockchip,rk3288-dw-hdmi";
 	reg = <0xff980000 0x20000>;
--- a/Documentation/devicetree/bindings/display/zte,vou.txt
+++ b/Documentation/devicetree/bindings/display/zte,vou.txt
@ -49,6 +49,15 @@ Required properties:
 	"osc_clk"
 	"xclk"

+* TV Encoder output device
+
+Required properties:
+ - compatible: should be "zte,zx296718-tvenc"
+ - reg: Physical base address and length of the TVENC device IO region
+ - zte,tvenc-power-control: the phandle to SYSCTRL block followed by two
+   integer cells.  The first cell is the offset of SYSCTRL register used
+   to control TV Encoder DAC power, and the second cell is the bit mask.
+
 Example:

 vou: vou@1440000 {
@ -81,4 +90,10 @@ vou: vou@1440000 {
 			 <&topcrm HDMI_XCLK>;
 		clock-names = "osc_cec", "osc_clk", "xclk";
 	};
+
+	tvenc: tvenc@2000 {
+		compatible = "zte,zx296718-tvenc";
+		reg = <0x2000 0x1000>;
+		zte,tvenc-power-control = <&sysctrl 0x170 0x10>;
+	};
 };
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@ -195,6 +195,7 @@ mpl	MPL AG
 mqmaker	mqmaker Inc.
 msi	Micro-Star International Co. Ltd.
 mti	Imagination Technologies Ltd. (formerly MIPS Technologies Inc.)
+multi-inno	Multi-Inno Technology Co.,Ltd
 mundoreader	Mundo Reader S.L.
 murata	Murata Manufacturing Co., Ltd.
 mxicy	Macronix International Co., Ltd.
@ -204,6 +205,7 @@ nec	NEC LCD Technologies, Ltd.
 neonode		Neonode Inc.
 netgear	NETGEAR
 netlogic	Broadcom Corporation (formerly NetLogic Microsystems)
+netron-dy	Netron DY
 netxeon		Shenzhen Netxeon Technology CO., LTD
 nexbox	Nexbox
 newhaven	Newhaven Display International
@ -305,6 +307,7 @@ technologic	Technologic Systems
 terasic	Terasic Inc.
 thine	THine Electronics, Inc.
 ti	Texas Instruments
+tianma	Tianma Micro-electronics Co., Ltd.
 tlm	Trusted Logic Mobility
 topeet  Topeet
 toradex	Toradex AG
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@ -1,482 +0,0 @@
-                    DMA Buffer Sharing API Guide
-                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-                            Sumit Semwal
-                <sumit dot semwal at linaro dot org>
-                 <sumit dot semwal at ti dot com>
-
-This document serves as a guide to device-driver writers on what is the dma-buf
-buffer sharing API, how to use it for exporting and using shared buffers.
-
-Any device driver which wishes to be a part of DMA buffer sharing, can do so as
-either the 'exporter' of buffers, or the 'user' of buffers.
-
-Say a driver A wants to use buffers created by driver B, then we call B as the
-exporter, and A as buffer-user.
-
-The exporter
- implements and manages operations[1] for the buffer
- allows other users to share the buffer by using dma_buf sharing APIs,
- manages the details of buffer allocation,
- decides about the actual backing storage where this allocation happens,
- takes care of any migration of scatterlist - for all (shared) users of this
-   buffer,
-
-The buffer-user
- is one of (many) sharing users of the buffer.
- doesn't need to worry about how the buffer is allocated, or where.
- needs a mechanism to get access to the scatterlist that makes up this buffer
-   in memory, mapped into its own address space, so it can access the same area
-   of memory.
-
-dma-buf operations for device dma only
--------------------------------------
-
-The dma_buf buffer sharing API usage contains the following steps:
-
-1. Exporter announces that it wishes to export a buffer
-2. Userspace gets the file descriptor associated with the exported buffer, and
-   passes it around to potential buffer-users based on use case
-3. Each buffer-user 'connects' itself to the buffer
-4. When needed, buffer-user requests access to the buffer from exporter
-5. When finished with its use, the buffer-user notifies end-of-DMA to exporter
-6. when buffer-user is done using this buffer completely, it 'disconnects'
-   itself from the buffer.
-
-
-1. Exporter's announcement of buffer export
-
-   The buffer exporter announces its wish to export a buffer. In this, it
-   connects its own private buffer data, provides implementation for operations
-   that can be performed on the exported dma_buf, and flags for the file
-   associated with this buffer. All these fields are filled in struct
-   dma_buf_export_info, defined via the DEFINE_DMA_BUF_EXPORT_INFO macro.
-
-   Interface:
-      DEFINE_DMA_BUF_EXPORT_INFO(exp_info)
-      struct dma_buf *dma_buf_export(struct dma_buf_export_info *exp_info)
-
-   If this succeeds, dma_buf_export allocates a dma_buf structure, and
-   returns a pointer to the same. It also associates an anonymous file with this
-   buffer, so it can be exported. On failure to allocate the dma_buf object,
-   it returns NULL.
-
-   'exp_name' in struct dma_buf_export_info is the name of exporter - to
-   facilitate information while debugging. It is set to KBUILD_MODNAME by
-   default, so exporters don't have to provide a specific name, if they don't
-   wish to.
-
-   DEFINE_DMA_BUF_EXPORT_INFO macro defines the struct dma_buf_export_info,
-   zeroes it out and pre-populates exp_name in it.
-
-
-2. Userspace gets a handle to pass around to potential buffer-users
-
-   Userspace entity requests for a file-descriptor (fd) which is a handle to the
-   anonymous file associated with the buffer. It can then share the fd with other
-   drivers and/or processes.
-
-   Interface:
-      int dma_buf_fd(struct dma_buf *dmabuf, int flags)
-
-   This API installs an fd for the anonymous file associated with this buffer;
-   returns either 'fd', or error.
-
-3. Each buffer-user 'connects' itself to the buffer
-
-   Each buffer-user now gets a reference to the buffer, using the fd passed to
-   it.
-
-   Interface:
-      struct dma_buf *dma_buf_get(int fd)
-
-   This API will return a reference to the dma_buf, and increment refcount for
-   it.
-
-   After this, the buffer-user needs to attach its device with the buffer, which
-   helps the exporter to know of device buffer constraints.
-
-   Interface:
-      struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-                                                struct device *dev)
-
-   This API returns reference to an attachment structure, which is then used
-   for scatterlist operations. It will optionally call the 'attach' dma_buf
-   operation, if provided by the exporter.
-
-   The dma-buf sharing framework does the bookkeeping bits related to managing
-   the list of all attachments to a buffer.
-
-Until this stage, the buffer-exporter has the option to choose not to actually
-allocate the backing storage for this buffer, but wait for the first buffer-user
-to request use of buffer for allocation.
-
-
-4. When needed, buffer-user requests access to the buffer
-
-   Whenever a buffer-user wants to use the buffer for any DMA, it asks for
-   access to the buffer using dma_buf_map_attachment API. At least one attach to
-   the buffer must have happened before map_dma_buf can be called.
-
-   Interface:
-      struct sg_table * dma_buf_map_attachment(struct dma_buf_attachment *,
-                                         enum dma_data_direction);
-
-   This is a wrapper to dma_buf->ops->map_dma_buf operation, which hides the
-   "dma_buf->ops->" indirection from the users of this interface.
-
-   In struct dma_buf_ops, map_dma_buf is defined as
-      struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *,
-                                                enum dma_data_direction);
-
-   It is one of the buffer operations that must be implemented by the exporter.
-   It should return the sg_table containing scatterlist for this buffer, mapped
-   into caller's address space.
-
-   If this is being called for the first time, the exporter can now choose to
-   scan through the list of attachments for this buffer, collate the requirements
-   of the attached devices, and choose an appropriate backing storage for the
-   buffer.
-
-   Based on enum dma_data_direction, it might be possible to have multiple users
-   accessing at the same time (for reading, maybe), or any other kind of sharing
-   that the exporter might wish to make available to buffer-users.
-
-   map_dma_buf() operation can return -EINTR if it is interrupted by a signal.
-
-
-5. When finished, the buffer-user notifies end-of-DMA to exporter
-
-   Once the DMA for the current buffer-user is over, it signals 'end-of-DMA' to
-   the exporter using the dma_buf_unmap_attachment API.
-
-   Interface:
-      void dma_buf_unmap_attachment(struct dma_buf_attachment *,
-                                    struct sg_table *);
-
-   This is a wrapper to dma_buf->ops->unmap_dma_buf() operation, which hides the
-   "dma_buf->ops->" indirection from the users of this interface.
-
-   In struct dma_buf_ops, unmap_dma_buf is defined as
-      void (*unmap_dma_buf)(struct dma_buf_attachment *,
-                            struct sg_table *,
-                            enum dma_data_direction);
-
-   unmap_dma_buf signifies the end-of-DMA for the attachment provided. Like
-   map_dma_buf, this API also must be implemented by the exporter.
-
-
-6. when buffer-user is done using this buffer, it 'disconnects' itself from the
-   buffer.
-
-   After the buffer-user has no more interest in using this buffer, it should
-   disconnect itself from the buffer:
-
-   - it first detaches itself from the buffer.
-
-   Interface:
-      void dma_buf_detach(struct dma_buf *dmabuf,
-                          struct dma_buf_attachment *dmabuf_attach);
-
-   This API removes the attachment from the list in dmabuf, and optionally calls
-   dma_buf->ops->detach(), if provided by exporter, for any housekeeping bits.
-
-   - Then, the buffer-user returns the buffer reference to exporter.
-
-   Interface:
-     void dma_buf_put(struct dma_buf *dmabuf);
-
-   This API then reduces the refcount for this buffer.
-
-   If, as a result of this call, the refcount becomes 0, the 'release' file
-   operation related to this fd is called. It calls the dmabuf->ops->release()
-   operation in turn, and frees the memory allocated for dmabuf when exported.
-
-NOTES:
- Importance of attach-detach and {map,unmap}_dma_buf operation pairs
-   The attach-detach calls allow the exporter to figure out backing-storage
-   constraints for the currently-interested devices. This allows preferential
-   allocation, and/or migration of pages across different types of storage
-   available, if possible.
-
-   Bracketing of DMA access with {map,unmap}_dma_buf operations is essential
-   to allow just-in-time backing of storage, and migration mid-way through a
-   use-case.
-
- Migration of backing storage if needed
-   If after
-   - at least one map_dma_buf has happened,
-   - and the backing storage has been allocated for this buffer,
-   another new buffer-user intends to attach itself to this buffer, it might
-   be allowed, if possible for the exporter.
-
-   In case it is allowed by the exporter:
-    if the new buffer-user has stricter 'backing-storage constraints', and the
-    exporter can handle these constraints, the exporter can just stall on the
-    map_dma_buf until all outstanding access is completed (as signalled by
-    unmap_dma_buf).
-    Once all users have finished accessing and have unmapped this buffer, the
-    exporter could potentially move the buffer to the stricter backing-storage,
-    and then allow further {map,unmap}_dma_buf operations from any buffer-user
-    from the migrated backing-storage.
-
-   If the exporter cannot fulfill the backing-storage constraints of the new
-   buffer-user device as requested, dma_buf_attach() would return an error to
-   denote non-compatibility of the new buffer-sharing request with the current
-   buffer.
-
-   If the exporter chooses not to allow an attach() operation once a
-   map_dma_buf() API has been called, it simply returns an error.
-
-Kernel cpu access to a dma-buf buffer object
--------------------------------------------
-
-The motivation to allow cpu access from the kernel to a dma-buf object from the
-importers side are:
- fallback operations, e.g. if the devices is connected to a usb bus and the
-  kernel needs to shuffle the data around first before sending it away.
- full transparency for existing users on the importer side, i.e. userspace
-  should not notice the difference between a normal object from that subsystem
-  and an imported one backed by a dma-buf. This is really important for drm
-  opengl drivers that expect to still use all the existing upload/download
-  paths.
-
-Access to a dma_buf from the kernel context involves three steps:
-
-1. Prepare access, which invalidate any necessary caches and make the object
-   available for cpu access.
-2. Access the object page-by-page with the dma_buf map apis
-3. Finish access, which will flush any necessary cpu caches and free reserved
-   resources.
-
-1. Prepare access
-
-   Before an importer can access a dma_buf object with the cpu from the kernel
-   context, it needs to notify the exporter of the access that is about to
-   happen.
-
-   Interface:
-      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-				   enum dma_data_direction direction)
-
-   This allows the exporter to ensure that the memory is actually available for
-   cpu access - the exporter might need to allocate or swap-in and pin the
-   backing storage. The exporter also needs to ensure that cpu access is
-   coherent for the access direction. The direction can be used by the exporter
-   to optimize the cache flushing, i.e. access with a different direction (read
-   instead of write) might return stale or even bogus data (e.g. when the
-   exporter needs to copy the data to temporary storage).
-
-   This step might fail, e.g. in oom conditions.
-
-2. Accessing the buffer
-
-   To support dma_buf objects residing in highmem cpu access is page-based using
-   an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
-   PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns
-   a pointer in kernel virtual address space. Afterwards the chunk needs to be
-   unmapped again. There is no limit on how often a given chunk can be mapped
-   and unmapped, i.e. the importer does not need to call begin_cpu_access again
-   before mapping the same chunk again.
-
-   Interfaces:
-      void *dma_buf_kmap(struct dma_buf *, unsigned long);
-      void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
-
-   There are also atomic variants of these interfaces. Like for kmap they
-   facilitate non-blocking fast-paths. Neither the importer nor the exporter (in
-   the callback) is allowed to block when using these.
-
-   Interfaces:
-      void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
-      void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
-
-   For importers all the restrictions of using kmap apply, like the limited
-   supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2
-   atomic dma_buf kmaps at the same time (in any given process context).
-
-   dma_buf kmap calls outside of the range specified in begin_cpu_access are
-   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
-   the partial chunks at the beginning and end but may return stale or bogus
-   data outside of the range (in these partial chunks).
-
-   Note that these calls need to always succeed. The exporter needs to complete
-   any preparations that might fail in begin_cpu_access.
-
-   For some cases the overhead of kmap can be too high, a vmap interface
-   is introduced. This interface should be used very carefully, as vmalloc
-   space is a limited resources on many architectures.
-
-   Interfaces:
-      void *dma_buf_vmap(struct dma_buf *dmabuf)
-      void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
-
-   The vmap call can fail if there is no vmap support in the exporter, or if it
-   runs out of vmalloc space. Fallback to kmap should be implemented. Note that
-   the dma-buf layer keeps a reference count for all vmap access and calls down
-   into the exporter's vmap function only when no vmapping exists, and only
-   unmaps it once. Protection against concurrent vmap/vunmap calls is provided
-   by taking the dma_buf->lock mutex.
-
-3. Finish access
-
-   When the importer is done accessing the CPU, it needs to announce this to
-   the exporter (to facilitate cache flushing and unpinning of any pinned
-   resources). The result of any dma_buf kmap calls after end_cpu_access is
-   undefined.
-
-   Interface:
-      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
-				  enum dma_data_direction dir);
-
-
-Direct Userspace Access/mmap Support
------------------------------------
-
-Being able to mmap an export dma-buf buffer object has 2 main use-cases:
- CPU fallback processing in a pipeline and
- supporting existing mmap interfaces in importers.
-
-1. CPU fallback processing in a pipeline
-
-   In many processing pipelines it is sometimes required that the cpu can access
-   the data in a dma-buf (e.g. for thumbnail creation, snapshots, ...). To avoid
-   the need to handle this specially in userspace frameworks for buffer sharing
-   it's ideal if the dma_buf fd itself can be used to access the backing storage
-   from userspace using mmap.
-
-   Furthermore Android's ION framework already supports this (and is otherwise
-   rather similar to dma-buf from a userspace consumer side with using fds as
-   handles, too). So it's beneficial to support this in a similar fashion on
-   dma-buf to have a good transition path for existing Android userspace.
-
-   No special interfaces, userspace simply calls mmap on the dma-buf fd, making
-   sure that the cache synchronization ioctl (DMA_BUF_IOCTL_SYNC) is *always*
-   used when the access happens. Note that DMA_BUF_IOCTL_SYNC can fail with
-   -EAGAIN or -EINTR, in which case it must be restarted.
-
-   Some systems might need some sort of cache coherency management e.g. when
-   CPU and GPU domains are being accessed through dma-buf at the same time. To
-   circumvent this problem there are begin/end coherency markers, that forward
-   directly to existing dma-buf device drivers vfunc hooks. Userspace can make
-   use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence
-   would be used like following:
-     - mmap dma-buf fd
-     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
-       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
-       want (with the new data being consumed by the GPU or say scanout device)
-     - munmap once you don't need the buffer any more
-
-    For correctness and optimal performance, it is always required to use
-    SYNC_START and SYNC_END before and after, respectively, when accessing the
-    mapped address. Userspace cannot rely on coherent access, even when there
-    are systems where it just works without calling these ioctls.
-
-2. Supporting existing mmap interfaces in importers
-
-   Similar to the motivation for kernel cpu access it is again important that
-   the userspace code of a given importing subsystem can use the same interfaces
-   with a imported dma-buf buffer object as with a native buffer object. This is
-   especially important for drm where the userspace part of contemporary OpenGL,
-   X, and other drivers is huge, and reworking them to use a different way to
-   mmap a buffer rather invasive.
-
-   The assumption in the current dma-buf interfaces is that redirecting the
-   initial mmap is all that's needed. A survey of some of the existing
-   subsystems shows that no driver seems to do any nefarious thing like syncing
-   up with outstanding asynchronous processing on the device or allocating
-   special resources at fault time. So hopefully this is good enough, since
-   adding interfaces to intercept pagefaults and allow pte shootdowns would
-   increase the complexity quite a bit.
-
-   Interface:
-      int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
-		       unsigned long);
-
-   If the importing subsystem simply provides a special-purpose mmap call to set
-   up a mapping in userspace, calling do_mmap with dma_buf->file will equally
-   achieve that for a dma-buf object.
-
-3. Implementation notes for exporters
-
-   Because dma-buf buffers have invariant size over their lifetime, the dma-buf
-   core checks whether a vma is too large and rejects such mappings. The
-   exporter hence does not need to duplicate this check.
-
-   Because existing importing subsystems might presume coherent mappings for
-   userspace, the exporter needs to set up a coherent mapping. If that's not
-   possible, it needs to fake coherency by manually shooting down ptes when
-   leaving the cpu domain and flushing caches at fault time. Note that all the
-   dma_buf files share the same anon inode, hence the exporter needs to replace
-   the dma_buf file stored in vma->vm_file with it's own if pte shootdown is
-   required. This is because the kernel uses the underlying inode's address_space
-   for vma tracking (and hence pte tracking at shootdown time with
-   unmap_mapping_range).
-
-   If the above shootdown dance turns out to be too expensive in certain
-   scenarios, we can extend dma-buf with a more explicit cache tracking scheme
-   for userspace mappings. But the current assumption is that using mmap is
-   always a slower path, so some inefficiencies should be acceptable.
-
-   Exporters that shoot down mappings (for any reasons) shall not do any
-   synchronization at fault time with outstanding device operations.
-   Synchronization is an orthogonal issue to sharing the backing storage of a
-   buffer and hence should not be handled by dma-buf itself. This is explicitly
-   mentioned here because many people seem to want something like this, but if
-   different exporters handle this differently, buffer sharing can fail in
-   interesting ways depending upong the exporter (if userspace starts depending
-   upon this implicit synchronization).
-
-Other Interfaces Exposed to Userspace on the dma-buf FD
------------------------------------------------------
-
- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
-  with offset=0 and whence=SEEK_END|SEEK_SET. SEEK_SET is supported to allow
-  the usual size discover pattern size = SEEK_END(0); SEEK_SET(0). Every other
-  llseek operation will report -EINVAL.
-
-  If llseek on dma-buf FDs isn't support the kernel will report -ESPIPE for all
-  cases. Userspace can use this to detect support for discovering the dma-buf
-  size using llseek.
-
-Miscellaneous notes
-------------------
-
- Any exporters or users of the dma-buf buffer sharing framework must have
-  a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
-
- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
-  on the file descriptor.  This is not just a resource leak, but a
-  potential security hole.  It could give the newly exec'd application
-  access to buffers, via the leaked fd, to which it should otherwise
-  not be permitted access.
-
-  The problem with doing this via a separate fcntl() call, versus doing it
-  atomically when the fd is created, is that this is inherently racy in a
-  multi-threaded app[3].  The issue is made worse when it is library code
-  opening/creating the file descriptor, as the application may not even be
-  aware of the fd's.
-
-  To avoid this problem, userspace must have a way to request O_CLOEXEC
-  flag be set when the dma-buf fd is created.  So any API provided by
-  the exporting driver to create a dmabuf fd must provide a way to let
-  userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd().
-
- If an exporter needs to manually flush caches and hence needs to fake
-  coherency for mmap support, it needs to be able to zap all the ptes pointing
-  at the backing storage. Now linux mm needs a struct address_space associated
-  with the struct file stored in vma->vm_file to do that with the function
-  unmap_mapping_range. But the dma_buf framework only backs every dma_buf fd
-  with the anon_file struct file, i.e. all dma_bufs share the same file.
-
-  Hence exporters need to setup their own file (and address_space) association
-  by setting vma->vm_file and adjusting vma->vm_pgoff in the dma_buf mmap
-  callback. In the specific case of a gem driver the exporter could use the
-  shmem file already provided by gem (and set vm_pgoff = 0). Exporters can then
-  zap ptes by unmapping the corresponding range of the struct address_space
-  associated with their own file.
-
-References:
-[1] struct dma_buf_ops in include/linux/dma-buf.h
-[2] All interfaces mentioned above defined in include/linux/dma-buf.h
-[3] https://lwn.net/Articles/236486/
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@ -17,6 +17,98 @@ shared or exclusive fence(s) associated with the buffer.
 Shared DMA Buffers
 ------------------

+This document serves as a guide to device-driver writers on what is the dma-buf
+buffer sharing API, how to use it for exporting and using shared buffers.
+
+Any device driver which wishes to be a part of DMA buffer sharing, can do so as
+either the 'exporter' of buffers, or the 'user' or 'importer' of buffers.
+
+Say a driver A wants to use buffers created by driver B, then we call B as the
+exporter, and A as buffer-user/importer.
+
+The exporter
+
+ - implements and manages operations in :c:type:`struct dma_buf_ops
+   <dma_buf_ops>` for the buffer,
+ - allows other users to share the buffer by using dma_buf sharing APIs,
+ - manages the details of buffer allocation, wrapped int a :c:type:`struct
+   dma_buf <dma_buf>`,
+ - decides about the actual backing storage where this allocation happens,
+ - and takes care of any migration of scatterlist - for all (shared) users of
+   this buffer.
+
+The buffer-user
+
+ - is one of (many) sharing users of the buffer.
+ - doesn't need to worry about how the buffer is allocated, or where.
+ - and needs a mechanism to get access to the scatterlist that makes up this
+   buffer in memory, mapped into its own address space, so it can access the
+   same area of memory. This interface is provided by :c:type:`struct
+   dma_buf_attachment <dma_buf_attachment>`.
+
+Any exporters or users of the dma-buf buffer sharing framework must have a
+'select DMA_SHARED_BUFFER' in their respective Kconfigs.
+
+Userspace Interface Notes
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Mostly a DMA buffer file descriptor is simply an opaque object for userspace,
+and hence the generic interface exposed is very minimal. There's a few things to
+consider though:
+
+- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
+  with offset=0 and whence=SEEK_END|SEEK_SET. SEEK_SET is supported to allow
+  the usual size discover pattern size = SEEK_END(0); SEEK_SET(0). Every other
+  llseek operation will report -EINVAL.
+
+  If llseek on dma-buf FDs isn't support the kernel will report -ESPIPE for all
+  cases. Userspace can use this to detect support for discovering the dma-buf
+  size using llseek.
+
+- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
+  on the file descriptor.  This is not just a resource leak, but a
+  potential security hole.  It could give the newly exec'd application
+  access to buffers, via the leaked fd, to which it should otherwise
+  not be permitted access.
+
+  The problem with doing this via a separate fcntl() call, versus doing it
+  atomically when the fd is created, is that this is inherently racy in a
+  multi-threaded app[3].  The issue is made worse when it is library code
+  opening/creating the file descriptor, as the application may not even be
+  aware of the fd's.
+
+  To avoid this problem, userspace must have a way to request O_CLOEXEC
+  flag be set when the dma-buf fd is created.  So any API provided by
+  the exporting driver to create a dmabuf fd must provide a way to let
+  userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd().
+
+- Memory mapping the contents of the DMA buffer is also supported. See the
+  discussion below on `CPU Access to DMA Buffer Objects`_ for the full details.
+
+- The DMA buffer FD is also pollable, see `Fence Poll Support`_ below for
+  details.
+
+Basic Operation and Device DMA Access
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+   :doc: dma buf device access
+
+CPU Access to DMA Buffer Objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+   :doc: cpu access
+
+Fence Poll Support
+~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-buf.c
+   :doc: fence polling
+
+Kernel Functions and Structures Reference
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 .. kernel-doc:: drivers/dma-buf/dma-buf.c
   :export:

--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@ -48,11 +48,17 @@ CRTC Abstraction
 ================

 .. kernel-doc:: drivers/gpu/drm/drm_crtc.c
-   :export:
+   :doc: overview
+
+CRTC Functions Reference
+--------------------------------

 .. kernel-doc:: include/drm/drm_crtc.h
   :internal:

+.. kernel-doc:: drivers/gpu/drm/drm_crtc.c
+   :export:
+
 Frame Buffer Abstraction
 ========================

--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@ -34,25 +34,26 @@ TTM initialization
 ------------------

    **Warning**
-
    This section is outdated.

-Drivers wishing to support TTM must fill out a drm_bo_driver
-structure. The structure contains several fields with function pointers
-for initializing the TTM, allocating and freeing memory, waiting for
-command completion and fence synchronization, and memory migration. See
-the radeon_ttm.c file for an example of usage.
+Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver
+<ttm_bo_driver>` structure to ttm_bo_device_init, together with an
+initialized global reference to the memory manager.  The ttm_bo_driver
+structure contains several fields with function pointers for
+initializing the TTM, allocating and freeing memory, waiting for command
+completion and fence synchronization, and memory migration.

-The ttm_global_reference structure is made up of several fields:
+The :c:type:`struct drm_global_reference <drm_global_reference>` is made
+up of several fields:

 .. code-block:: c

-              struct ttm_global_reference {
+              struct drm_global_reference {
                      enum ttm_global_types global_type;
                      size_t size;
                      void *object;
-                      int (*init) (struct ttm_global_reference *);
-                      void (*release) (struct ttm_global_reference *);
+                      int (*init) (struct drm_global_reference *);
+                      void (*release) (struct drm_global_reference *);
              };


@ -76,6 +77,12 @@ ttm_bo_global_release(), respectively. Also, like the previous
 object, ttm_global_item_ref() is used to create an initial reference
 count for the TTM, which will call your initialization function.

+See the radeon_ttm.c file for an example of usage.
+
+.. kernel-doc:: drivers/gpu/drm/drm_global.c
+   :export:
+
+
 The Graphics Execution Manager (GEM)
 ====================================

@ -284,10 +291,17 @@ To use :c:func:`drm_gem_mmap()`, drivers must fill the struct
 :c:type:`struct drm_driver <drm_driver>` gem_vm_ops field
 with a pointer to VM operations.

-struct vm_operations_struct \*gem_vm_ops struct
-vm_operations_struct { void (\*open)(struct vm_area_struct \* area);
-void (\*close)(struct vm_area_struct \* area); int (\*fault)(struct
-vm_area_struct \*vma, struct vm_fault \*vmf); };
+The VM operations is a :c:type:`struct vm_operations_struct <vm_operations_struct>`
+made up of several fields, the more interesting ones being:
+
+.. code-block:: c
+
+	struct vm_operations_struct {
+		void (*open)(struct vm_area_struct * area);
+		void (*close)(struct vm_area_struct * area);
+		int (*fault)(struct vm_fault *vmf);
+	};
+

 The open and close operations must update the GEM object reference
 count. Drivers can use the :c:func:`drm_gem_vm_open()` and
@ -303,6 +317,17 @@ created.
 Drivers that want to map the GEM object upfront instead of handling page
 faults can implement their own mmap file operation handler.

+For platforms without MMU the GEM core provides a helper method
+:c:func:`drm_gem_cma_get_unmapped_area`. The mmap() routines will call
+this to get a proposed address for the mapping.
+
+To use :c:func:`drm_gem_cma_get_unmapped_area`, drivers must fill the
+struct :c:type:`struct file_operations <file_operations>` get_unmapped_area
+field with a pointer on :c:func:`drm_gem_cma_get_unmapped_area`.
+
+More detailed information about get_unmapped_area can be found in
+Documentation/nommu-mmap.txt
+
 Memory Coherency
 ----------------

@ -442,7 +467,7 @@ LRU Scan/Eviction Support
 -------------------------

 .. kernel-doc:: drivers/gpu/drm/drm_mm.c
-   :doc: lru scan roaster
+   :doc: lru scan roster

 DRM MM Range Allocator Function References
 ------------------------------------------
@ -452,3 +477,9 @@ DRM MM Range Allocator Function References

 .. kernel-doc:: include/drm/drm_mm.h
   :internal:
+
+DRM Cache Handling
+==================
+
+.. kernel-doc:: drivers/gpu/drm/drm_cache.c
+   :export:
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@ -156,8 +156,12 @@ other hand, a driver requires shared state between clients which is
 visible to user-space and accessible beyond open-file boundaries, they
 cannot support render nodes.

+
+Testing and validation
+======================
+
 Validating changes with IGT
-===========================
+---------------------------

 There's a collection of tests that aims to cover the whole functionality of
 DRM drivers and that can be used to check that changes to DRM drivers or the
@ -193,6 +197,12 @@ run-tests.sh is a wrapper around piglit that will execute the tests matching
 the -t options. A report in HTML format will be available in
 ./results/html/index.html. Results can be compared with piglit.

+Display CRC Support
+-------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_debugfs_crc.c
+   :doc: CRC ABI
+
 VBlank event handling
 =====================

@ -209,16 +219,3 @@ DRM_IOCTL_MODESET_CTL
    mode setting, since on many devices the vertical blank counter is
    reset to 0 at some point during modeset. Modern drivers should not
    call this any more since with kernel mode setting it is a no-op.
-
-This second part of the GPU Driver Developer's Guide documents driver
-code, implementation details and also all the driver-specific userspace
-interfaces. Especially since all hardware-acceleration interfaces to
-userspace are driver specific for efficiency and other reasons these
-interfaces can be rather substantial. Hence every driver has its own
-chapter.
-
-Testing and validation
-======================
-
-.. kernel-doc:: drivers/gpu/drm/drm_debugfs_crc.c
-   :doc: CRC ABI
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@ -222,6 +222,18 @@ Video BIOS Table (VBT)
 .. kernel-doc:: drivers/gpu/drm/i915/intel_vbt_defs.h
   :internal:

+Display PLLs
+------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/intel_dpll_mgr.c
+   :doc: Display PLLs
+
+.. kernel-doc:: drivers/gpu/drm/i915/intel_dpll_mgr.c
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/i915/intel_dpll_mgr.h
+   :internal:
+
 Memory Management and Command Submission
 ========================================

@ -365,4 +377,95 @@ switch_mm
 .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
   :doc: switch_mm tracepoint

+Perf
+====
+
+Overview
+--------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf Overview
+
+Comparison with Core Perf
+-------------------------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf History and Comparison with Core Perf
+
+i915 Driver Entry Points
+------------------------
+
+This section covers the entrypoints exported outside of i915_perf.c to
+integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_fini
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_register
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_unregister
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_release
+
+i915 Perf Stream
+----------------
+
+This section covers the stream-semantics-agnostic structures and functions
+for representing an i915 perf stream FD and associated file operations.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: read_properties_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_destroy_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_enable_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_disable_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll_locked
+
+i915 Perf Observation Architecture Stream
+-----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_oa_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_enable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_disable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_wait_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_poll_wait
+
+All i915 Perf Internals
+-----------------------
+
+This section simply includes all currently documented i915 perf internals, in
+no particular order, but may include some more minor utilities or platform
+specific details than found in the more high-level sections.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :internal:
+
 .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
--- a/Documentation/gpu/index.rst
+++ b/Documentation/gpu/index.rst
@ -11,6 +11,7 @@ Linux GPU Driver Developer's Guide
   drm-kms-helpers
   drm-uapi
   i915
+   tinydrm
   vga-switcheroo
   vgaarbiter

--- a/Documentation/gpu/introduction.rst
+++ b/Documentation/gpu/introduction.rst
@ -23,13 +23,12 @@ For consistency this documentation uses American English. Abbreviations
 are written as all-uppercase, for example: DRM, KMS, IOCTL, CRTC, and so
 on. To aid in reading, documentations make full use of the markup
 characters kerneldoc provides: @parameter for function parameters,
-@member for structure members, &structure to reference structures and
-function() for functions. These all get automatically hyperlinked if
-kerneldoc for the referenced objects exists. When referencing entries in
-function vtables please use ->vfunc(). Note that kerneldoc does not
-support referencing struct members directly, so please add a reference
-to the vtable struct somewhere in the same paragraph or at least
-section.
+@member for structure members (within the same structure), &struct structure to
+reference structures and function() for functions. These all get automatically
+hyperlinked if kerneldoc for the referenced objects exists. When referencing
+entries in function vtables (and structure members in general) please use
+&vtable_name.vfunc. Unfortunately this does not yet yield a direct link to the
+member, only the structure.

 Except in special situations (to separate locked from unlocked variants)
 locking requirements for functions aren't documented in the kerneldoc.
@ -49,3 +48,5 @@ section name should be all upper-case or not, and whether it should end
 in a colon or not. Go with the file-local style. Other common section
 names are "Notes" with information for dangerous or tricky corner cases,
 and "FIXME" where the interface could be cleaned up.
+
+Also read the :ref:`guidelines for the kernel documentation at large <doc_guide>`.
--- a/Documentation/gpu/tinydrm.rst
+++ b/Documentation/gpu/tinydrm.rst
@ -0,0 +1,42 @@
+==========================
+drm/tinydrm Driver library
+==========================
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/core/tinydrm-core.c
+   :doc: overview
+
+Core functionality
+==================
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/core/tinydrm-core.c
+   :doc: core
+
+.. kernel-doc:: include/drm/tinydrm/tinydrm.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/core/tinydrm-core.c
+   :export:
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
+   :export:
+
+Additional helpers
+==================
+
+.. kernel-doc:: include/drm/tinydrm/tinydrm-helpers.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
+   :export:
+
+MIPI DBI Compatible Controllers
+===============================
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/mipi-dbi.c
+   :doc: overview
+
+.. kernel-doc:: include/drm/tinydrm/mipi-dbi.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/tinydrm/mipi-dbi.c
+   :export:
--- a/Documentation/sound/hd-audio/dp-mst.rst
+++ b/Documentation/sound/hd-audio/dp-mst.rst
@ -19,6 +19,23 @@ PCM
 ===
 To be added

+Pin Initialization
+==================
+Each pin may have several device entries (virtual pins). On Intel platform,
+the device entries number is dynamically changed. If DP MST hub is connected,
+it is in DP MST mode, and the device entries number is 3. Otherwise, the
+device entries number is 1.
+
+To simplify the implementation, all the device entries will be initialized
+when bootup no matter whether it is in DP MST mode or not.
+
+Connection list
+===============
+DP MST reuses connection list code. The code can be reused because
+device entries on the same pin have the same connection list.
+
+This means DP MST gets the device entry connection list without the
+device entry setting.

 Jack
 ====
--- a/9
+++ b/9
@ -4031,7 +4031,7 @@ F:	drivers/dma-buf/
 F:	include/linux/dma-buf*
 F:	include/linux/reservation.h
 F:	include/linux/*fence.h
-F:	Documentation/dma-buf-sharing.txt
+F:	Documentation/driver-api/dma-buf.rst
 T:	git git://anongit.freedesktop.org/drm/drm-misc

 SYNC FILE FRAMEWORK
@ -4041,6 +4041,7 @@ S:	Maintained
 L:	linux-media@vger.kernel.org
 L:	dri-devel@lists.freedesktop.org
 F:	drivers/dma-buf/sync_*
+F:	drivers/dma-buf/dma-fence*
 F:	drivers/dma-buf/sw_sync.c
 F:	include/linux/sync_file.h
 F:	include/uapi/linux/sync_file.h
@ -4315,6 +4316,12 @@ S:	Supported
 F:	drivers/gpu/drm/mediatek/
 F:	Documentation/devicetree/bindings/display/mediatek/

+DRM DRIVER FOR MI0283QT
+M:	Noralf Trønnes <noralf@tronnes.org>
+S:	Maintained
+F:	drivers/gpu/drm/tinydrm/mi0283qt.c
+F:	Documentation/devicetree/bindings/display/multi-inno,mi0283qt.txt
+
 DRM DRIVER FOR MSM ADRENO GPU
 M:	Rob Clark <robdclark@gmail.com>
 L:	linux-arm-msm@vger.kernel.org
--- a/arch/blackfin/include/asm/vga.h
+++ b/arch/blackfin/include/asm/vga.h
@ -0,0 +1 @@
+#include <asm-generic/vga.h>
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@ -1420,8 +1420,10 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
 }
 EXPORT_SYMBOL(intel_gmch_probe);

-void intel_gtt_get(u64 *gtt_total, size_t *stolen_size,
-		   phys_addr_t *mappable_base, u64 *mappable_end)
+void intel_gtt_get(u64 *gtt_total,
+		   u32 *stolen_size,
+		   phys_addr_t *mappable_base,
+		   u64 *mappable_end)
 {
 	*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
 	*stolen_size = intel_private.stolen_size;
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@ -124,6 +124,28 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
 	return base + offset;
 }

+/**
+ * DOC: fence polling
+ *
+ * To support cross-device and cross-driver synchronization of buffer access
+ * implicit fences (represented internally in the kernel with &struct fence) can
+ * be attached to a &dma_buf. The glue for that and a few related things are
+ * provided in the &reservation_object structure.
+ *
+ * Userspace can query the state of these implicitly tracked fences using poll()
+ * and related system calls:
+ *
+ * - Checking for POLLIN, i.e. read access, can be use to query the state of the
+ *   most recent write or exclusive fence.
+ *
+ * - Checking for POLLOUT, i.e. write access, can be used to query the state of
+ *   all attached fences, shared and exclusive ones.
+ *
+ * Note that this only signals the completion of the respective fences, i.e. the
+ * DMA transfers are complete. Cache flushing and any other necessary
+ * preparations before CPU access can begin still need to happen.
+ */
+
 static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 {
 	struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb;
@ -313,6 +335,37 @@ static inline int is_dma_buf_file(struct file *file)
 	return file->f_op == &dma_buf_fops;
 }

+/**
+ * DOC: dma buf device access
+ *
+ * For device DMA access to a shared DMA buffer the usual sequence of operations
+ * is fairly simple:
+ *
+ * 1. The exporter defines his exporter instance using
+ *    DEFINE_DMA_BUF_EXPORT_INFO() and calls dma_buf_export() to wrap a private
+ *    buffer object into a &dma_buf. It then exports that &dma_buf to userspace
+ *    as a file descriptor by calling dma_buf_fd().
+ *
+ * 2. Userspace passes this file-descriptors to all drivers it wants this buffer
+ *    to share with: First the filedescriptor is converted to a &dma_buf using
+ *    dma_buf_get(). The the buffer is attached to the device using
+ *    dma_buf_attach().
+ *
+ *    Up to this stage the exporter is still free to migrate or reallocate the
+ *    backing storage.
+ *
+ * 3. Once the buffer is attached to all devices userspace can inniate DMA
+ *    access to the shared buffer. In the kernel this is done by calling
+ *    dma_buf_map_attachment() and dma_buf_unmap_attachment().
+ *
+ * 4. Once a driver is done with a shared buffer it needs to call
+ *    dma_buf_detach() (after cleaning up any mappings) and then release the
+ *    reference acquired with dma_buf_get by calling dma_buf_put().
+ *
+ * For the detailed semantics exporters are expected to implement see
+ * &dma_buf_ops.
+ */
+
 /**
 * dma_buf_export - Creates a new dma_buf, and associates an anon file
 * with this buffer, so it can be exported.
@ -320,13 +373,15 @@ static inline int is_dma_buf_file(struct file *file)
 * Additionally, provide a name string for exporter; useful in debugging.
 *
 * @exp_info:	[in]	holds all the export related information provided
- *			by the exporter. see struct dma_buf_export_info
+ *			by the exporter. see &struct dma_buf_export_info
 *			for further details.
 *
 * Returns, on success, a newly created dma_buf object, which wraps the
 * supplied private data and operations for dma_buf_ops. On either missing
 * ops, or error in allocating struct dma_buf, will return negative error.
 *
+ * For most cases the easiest way to create @exp_info is through the
+ * %DEFINE_DMA_BUF_EXPORT_INFO macro.
 */
 struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 {
@ -458,7 +513,11 @@ EXPORT_SYMBOL_GPL(dma_buf_get);
 * dma_buf_put - decreases refcount of the buffer
 * @dmabuf:	[in]	buffer to reduce refcount of
 *
- * Uses file's refcounting done implicitly by fput()
+ * Uses file's refcounting done implicitly by fput().
+ *
+ * If, as a result of this call, the refcount becomes 0, the 'release' file
+ * operation related to this fd is called. It calls &dma_buf_ops.release vfunc
+ * in turn, and frees the memory allocated for dmabuf when exported.
 */
 void dma_buf_put(struct dma_buf *dmabuf)
 {
@ -475,8 +534,17 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
 * @dmabuf:	[in]	buffer to attach device to.
 * @dev:	[in]	device to be attached.
 *
- * Returns struct dma_buf_attachment * for this attachment; returns ERR_PTR on
- * error.
+ * Returns struct dma_buf_attachment pointer for this attachment. Attachments
+ * must be cleaned up by calling dma_buf_detach().
+ *
+ * Returns:
+ *
+ * A pointer to newly created &dma_buf_attachment on success, or a negative
+ * error code wrapped into a pointer on failure.
+ *
+ * Note that this can fail if the backing storage of @dmabuf is in a place not
+ * accessible to @dev, and cannot be moved to a more suitable place. This is
+ * indicated with the error code -EBUSY.
 */
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 					  struct device *dev)
@ -519,6 +587,7 @@ EXPORT_SYMBOL_GPL(dma_buf_attach);
 * @dmabuf:	[in]	buffer to detach from.
 * @attach:	[in]	attachment to be detached; is free'd after this call.
 *
+ * Clean up a device attachment obtained by calling dma_buf_attach().
 */
 void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
 {
@ -543,7 +612,12 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
 * @direction:	[in]	direction of DMA transfer
 *
 * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
- * on error.
+ * on error. May return -EINTR if it is interrupted by a signal.
+ *
+ * A mapping must be unmapped again using dma_buf_map_attachment(). Note that
+ * the underlying backing storage is pinned for as long as a mapping exists,
+ * therefore users/importers should not hold onto a mapping for undue amounts of
+ * time.
 */
 struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 					enum dma_data_direction direction)
@ -571,6 +645,7 @@ EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
 * @sg_table:	[in]	scatterlist info of the buffer to unmap
 * @direction:  [in]    direction of DMA transfer
 *
+ * This unmaps a DMA mapping for @attached obtained by dma_buf_map_attachment().
 */
 void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 				struct sg_table *sg_table,
@ -586,6 +661,122 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 }
 EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);

+/**
+ * DOC: cpu access
+ *
+ * There are mutliple reasons for supporting CPU access to a dma buffer object:
+ *
+ * - Fallback operations in the kernel, for example when a device is connected
+ *   over USB and the kernel needs to shuffle the data around first before
+ *   sending it away. Cache coherency is handled by braketing any transactions
+ *   with calls to dma_buf_begin_cpu_access() and dma_buf_end_cpu_access()
+ *   access.
+ *
+ *   To support dma_buf objects residing in highmem cpu access is page-based
+ *   using an api similar to kmap. Accessing a dma_buf is done in aligned chunks
+ *   of PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which
+ *   returns a pointer in kernel virtual address space. Afterwards the chunk
+ *   needs to be unmapped again. There is no limit on how often a given chunk
+ *   can be mapped and unmapped, i.e. the importer does not need to call
+ *   begin_cpu_access again before mapping the same chunk again.
+ *
+ *   Interfaces::
+ *      void \*dma_buf_kmap(struct dma_buf \*, unsigned long);
+ *      void dma_buf_kunmap(struct dma_buf \*, unsigned long, void \*);
+ *
+ *   There are also atomic variants of these interfaces. Like for kmap they
+ *   facilitate non-blocking fast-paths. Neither the importer nor the exporter
+ *   (in the callback) is allowed to block when using these.
+ *
+ *   Interfaces::
+ *      void \*dma_buf_kmap_atomic(struct dma_buf \*, unsigned long);
+ *      void dma_buf_kunmap_atomic(struct dma_buf \*, unsigned long, void \*);
+ *
+ *   For importers all the restrictions of using kmap apply, like the limited
+ *   supply of kmap_atomic slots. Hence an importer shall only hold onto at
+ *   max 2 atomic dma_buf kmaps at the same time (in any given process context).
+ *
+ *   dma_buf kmap calls outside of the range specified in begin_cpu_access are
+ *   undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on
+ *   the partial chunks at the beginning and end but may return stale or bogus
+ *   data outside of the range (in these partial chunks).
+ *
+ *   Note that these calls need to always succeed. The exporter needs to
+ *   complete any preparations that might fail in begin_cpu_access.
+ *
+ *   For some cases the overhead of kmap can be too high, a vmap interface
+ *   is introduced. This interface should be used very carefully, as vmalloc
+ *   space is a limited resources on many architectures.
+ *
+ *   Interfaces::
+ *      void \*dma_buf_vmap(struct dma_buf \*dmabuf)
+ *      void dma_buf_vunmap(struct dma_buf \*dmabuf, void \*vaddr)
+ *
+ *   The vmap call can fail if there is no vmap support in the exporter, or if
+ *   it runs out of vmalloc space. Fallback to kmap should be implemented. Note
+ *   that the dma-buf layer keeps a reference count for all vmap access and
+ *   calls down into the exporter's vmap function only when no vmapping exists,
+ *   and only unmaps it once. Protection against concurrent vmap/vunmap calls is
+ *   provided by taking the dma_buf->lock mutex.
+ *
+ * - For full compatibility on the importer side with existing userspace
+ *   interfaces, which might already support mmap'ing buffers. This is needed in
+ *   many processing pipelines (e.g. feeding a software rendered image into a
+ *   hardware pipeline, thumbnail creation, snapshots, ...). Also, Android's ION
+ *   framework already supported this and for DMA buffer file descriptors to
+ *   replace ION buffers mmap support was needed.
+ *
+ *   There is no special interfaces, userspace simply calls mmap on the dma-buf
+ *   fd. But like for CPU access there's a need to braket the actual access,
+ *   which is handled by the ioctl (DMA_BUF_IOCTL_SYNC). Note that
+ *   DMA_BUF_IOCTL_SYNC can fail with -EAGAIN or -EINTR, in which case it must
+ *   be restarted.
+ *
+ *   Some systems might need some sort of cache coherency management e.g. when
+ *   CPU and GPU domains are being accessed through dma-buf at the same time.
+ *   To circumvent this problem there are begin/end coherency markers, that
+ *   forward directly to existing dma-buf device drivers vfunc hooks. Userspace
+ *   can make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The
+ *   sequence would be used like following:
+ *
+ *     - mmap dma-buf fd
+ *     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
+ *       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
+ *       want (with the new data being consumed by say the GPU or the scanout
+ *       device)
+ *     - munmap once you don't need the buffer any more
+ *
+ *    For correctness and optimal performance, it is always required to use
+ *    SYNC_START and SYNC_END before and after, respectively, when accessing the
+ *    mapped address. Userspace cannot rely on coherent access, even when there
+ *    are systems where it just works without calling these ioctls.
+ *
+ * - And as a CPU fallback in userspace processing pipelines.
+ *
+ *   Similar to the motivation for kernel cpu access it is again important that
+ *   the userspace code of a given importing subsystem can use the same
+ *   interfaces with a imported dma-buf buffer object as with a native buffer
+ *   object. This is especially important for drm where the userspace part of
+ *   contemporary OpenGL, X, and other drivers is huge, and reworking them to
+ *   use a different way to mmap a buffer rather invasive.
+ *
+ *   The assumption in the current dma-buf interfaces is that redirecting the
+ *   initial mmap is all that's needed. A survey of some of the existing
+ *   subsystems shows that no driver seems to do any nefarious thing like
+ *   syncing up with outstanding asynchronous processing on the device or
+ *   allocating special resources at fault time. So hopefully this is good
+ *   enough, since adding interfaces to intercept pagefaults and allow pte
+ *   shootdowns would increase the complexity quite a bit.
+ *
+ *   Interface::
+ *      int dma_buf_mmap(struct dma_buf \*, struct vm_area_struct \*,
+ *		       unsigned long);
+ *
+ *   If the importing subsystem simply provides a special-purpose mmap call to
+ *   set up a mapping in userspace, calling do_mmap with dma_buf->file will
+ *   equally achieve that for a dma-buf object.
+ */
+
 static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 				      enum dma_data_direction direction)
 {
@ -611,6 +802,10 @@ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 * @dmabuf:	[in]	buffer to prepare cpu access for.
 * @direction:	[in]	length of range for cpu access.
 *
+ * After the cpu access is complete the caller should call
+ * dma_buf_end_cpu_access(). Only when cpu access is braketed by both calls is
+ * it guaranteed to be coherent with other DMA access.
+ *
 * Can return negative error values, returns 0 on success.
 */
 int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
@ -643,6 +838,8 @@ EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
 * @dmabuf:	[in]	buffer to complete cpu access for.
 * @direction:	[in]	length of range for cpu access.
 *
+ * This terminates CPU access started with dma_buf_begin_cpu_access().
+ *
 * Can return negative error values, returns 0 on success.
 */
 int dma_buf_end_cpu_access(struct dma_buf *dmabuf,
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@ -28,6 +28,7 @@

 EXPORT_TRACEPOINT_SYMBOL(dma_fence_annotate_wait_on);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);

 /*
 * fence context counter: each execution context should have its own
@ -281,6 +282,31 @@ int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb,
 }
 EXPORT_SYMBOL(dma_fence_add_callback);

+/**
+ * dma_fence_get_status - returns the status upon completion
+ * @fence: [in]	the dma_fence to query
+ *
+ * This wraps dma_fence_get_status_locked() to return the error status
+ * condition on a signaled fence. See dma_fence_get_status_locked() for more
+ * details.
+ *
+ * Returns 0 if the fence has not yet been signaled, 1 if the fence has
+ * been signaled without an error condition, or a negative error code
+ * if the fence has been completed in err.
+ */
+int dma_fence_get_status(struct dma_fence *fence)
+{
+	unsigned long flags;
+	int status;
+
+	spin_lock_irqsave(fence->lock, flags);
+	status = dma_fence_get_status_locked(fence);
+	spin_unlock_irqrestore(fence->lock, flags);
+
+	return status;
+}
+EXPORT_SYMBOL(dma_fence_get_status);
+
 /**
 * dma_fence_remove_callback - remove a callback from the signaling list
 * @fence:	[in]	the fence to wait on
@ -541,6 +567,7 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
 	fence->context = context;
 	fence->seqno = seqno;
 	fence->flags = 0UL;
+	fence->error = 0;

 	trace_dma_fence_init(fence);
 }
--- a/drivers/dma-buf/sync_debug.c
+++ b/drivers/dma-buf/sync_debug.c
@ -62,30 +62,29 @@ void sync_file_debug_remove(struct sync_file *sync_file)

 static const char *sync_status_str(int status)
 {
-	if (status == 0)
-		return "signaled";
+	if (status < 0)
+		return "error";

 	if (status > 0)
-		return "active";
+		return "signaled";

-	return "error";
+	return "active";
 }

 static void sync_print_fence(struct seq_file *s,
 			     struct dma_fence *fence, bool show)
 {
-	int status = 1;
 	struct sync_timeline *parent = dma_fence_parent(fence);
+	int status;

-	if (dma_fence_is_signaled_locked(fence))
-		status = fence->status;
+	status = dma_fence_get_status_locked(fence);

 	seq_printf(s, "  %s%sfence %s",
 		   show ? parent->name : "",
 		   show ? "_" : "",
 		   sync_status_str(status));

-	if (status <= 0) {
+	if (status) {
 		struct timespec64 ts64 =
 			ktime_to_timespec64(fence->timestamp);

@ -136,7 +135,7 @@ static void sync_print_sync_file(struct seq_file *s,
 	int i;

 	seq_printf(s, "[%p] %s: %s\n", sync_file, sync_file->name,
-		   sync_status_str(!dma_fence_is_signaled(sync_file->fence)));
+		   sync_status_str(dma_fence_get_status(sync_file->fence)));

 	if (dma_fence_is_array(sync_file->fence)) {
 		struct dma_fence_array *array = to_dma_fence_array(sync_file->fence);
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@ -67,9 +67,10 @@ static void fence_check_cb_func(struct dma_fence *f, struct dma_fence_cb *cb)
 * sync_file_create() - creates a sync file
 * @fence:	fence to add to the sync_fence
 *
- * Creates a sync_file containg @fence. Once this is called, the sync_file
- * takes ownership of @fence. The sync_file can be released with
- * fput(sync_file->file). Returns the sync_file or NULL in case of error.
+ * Creates a sync_file containg @fence. This function acquires and additional
+ * reference of @fence for the newly-created &sync_file, if it succeeds. The
+ * sync_file can be released with fput(sync_file->file). Returns the
+ * sync_file or NULL in case of error.
 */
 struct sync_file *sync_file_create(struct dma_fence *fence)
 {
@ -90,13 +91,6 @@ struct sync_file *sync_file_create(struct dma_fence *fence)
 }
 EXPORT_SYMBOL(sync_file_create);

-/**
- * sync_file_fdget() - get a sync_file from an fd
- * @fd:		fd referencing a fence
- *
- * Ensures @fd references a valid sync_file, increments the refcount of the
- * backing file. Returns the sync_file or NULL in case of error.
- */
 static struct sync_file *sync_file_fdget(int fd)
 {
 	struct file *file = fget(fd);
@ -379,10 +373,8 @@ static void sync_fill_fence_info(struct dma_fence *fence,
 		sizeof(info->obj_name));
 	strlcpy(info->driver_name, fence->ops->get_driver_name(fence),
 		sizeof(info->driver_name));
-	if (dma_fence_is_signaled(fence))
-		info->status = fence->status >= 0 ? 1 : fence->status;
-	else
-		info->status = 0;
+
+	info->status = dma_fence_get_status(fence);
 	info->timestamp_ns = ktime_to_ns(fence->timestamp);
 }

@ -468,4 +460,3 @@ static const struct file_operations sync_file_fops = {
 	.unlocked_ioctl = sync_file_ioctl,
 	.compat_ioctl = sync_file_ioctl,
 };
-
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@ -6,7 +6,7 @@
 #
 menuconfig DRM
 	tristate "Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)"
-	depends on (AGP || AGP=n) && !EMULATED_CMPXCHG && MMU && HAS_DMA
+	depends on (AGP || AGP=n) && !EMULATED_CMPXCHG && HAS_DMA
 	select HDMI
 	select FB_CMDLINE
 	select I2C
@ -48,6 +48,21 @@ config DRM_DEBUG_MM

 	  If in doubt, say "N".

+config DRM_DEBUG_MM_SELFTEST
+	tristate "kselftests for DRM range manager (struct drm_mm)"
+	depends on DRM
+	depends on DEBUG_KERNEL
+	select PRIME_NUMBERS
+	select DRM_LIB_RANDOM
+	default n
+	help
+	  This option provides a kernel module that can be used to test
+	  the DRM range manager (drm_mm) and its API. This option is not
+	  useful for distributions or general kernels, but only for kernel
+	  developers working on DRM and associated drivers.
+
+	  If in doubt, say "N".
+
 config DRM_KMS_HELPER
 	tristate
 	depends on DRM
@ -98,7 +113,7 @@ config DRM_LOAD_EDID_FIRMWARE

 config DRM_TTM
 	tristate
-	depends on DRM
+	depends on DRM && MMU
 	help
 	  GPU memory management subsystem for devices with multiple
 	  GPU memory types. Will be enabled automatically if a device driver
@ -121,13 +136,17 @@ config DRM_KMS_CMA_HELPER
 	help
 	  Choose this if you need the KMS CMA helper functions

+config DRM_VM
+	bool
+	depends on DRM && MMU
+
 source "drivers/gpu/drm/i2c/Kconfig"

 source "drivers/gpu/drm/arm/Kconfig"

 config DRM_RADEON
 	tristate "ATI Radeon"
-	depends on DRM && PCI
+	depends on DRM && PCI && MMU
 	select FW_LOADER
        select DRM_KMS_HELPER
        select DRM_TTM
@ -147,7 +166,7 @@ source "drivers/gpu/drm/radeon/Kconfig"

 config DRM_AMDGPU
 	tristate "AMD GPU"
-	depends on DRM && PCI
+	depends on DRM && PCI && MMU
 	select FW_LOADER
        select DRM_KMS_HELPER
        select DRM_TTM
@ -244,11 +263,14 @@ source "drivers/gpu/drm/mxsfb/Kconfig"

 source "drivers/gpu/drm/meson/Kconfig"

+source "drivers/gpu/drm/tinydrm/Kconfig"
+
 # Keep legacy drivers last

 menuconfig DRM_LEGACY
 	bool "Enable legacy drivers (DANGEROUS)"
-	depends on DRM
+	depends on DRM && MMU
+	select DRM_VM
 	help
 	  Enable legacy DRI1 drivers. Those drivers expose unsafe and dangerous
 	  APIs to user-space, which can be used to circumvent access
@ -321,3 +343,7 @@ config DRM_SAVAGE
 	  chipset. If M is selected the module will be called savage.

 endif # DRM_LEGACY
+
+config DRM_LIB_RANDOM
+	bool
+	default n
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@ -5,7 +5,7 @@
 drm-y       :=	drm_auth.o drm_bufs.o drm_cache.o \
 		drm_context.o drm_dma.o \
 		drm_fops.o drm_gem.o drm_ioctl.o drm_irq.o \
-		drm_lock.o drm_memory.o drm_drv.o drm_vm.o \
+		drm_lock.o drm_memory.o drm_drv.o \
 		drm_scatter.o drm_pci.o \
 		drm_platform.o drm_sysfs.o drm_hashtab.o drm_mm.o \
 		drm_crtc.o drm_fourcc.o drm_modes.o drm_edid.o \
@ -18,6 +18,8 @@ drm-y       :=	drm_auth.o drm_bufs.o drm_cache.o \
 		drm_plane.o drm_color_mgmt.o drm_print.o \
 		drm_dumb_buffers.o drm_mode_config.o

+drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
+drm-$(CONFIG_DRM_VM) += drm_vm.o
 drm-$(CONFIG_COMPAT) += drm_ioc32.o
 drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o
 drm-$(CONFIG_PCI) += ati_pcigart.o
@ -37,6 +39,7 @@ drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
 drm_kms_helper-$(CONFIG_DRM_DP_AUX_CHARDEV) += drm_dp_aux_dev.o

 obj-$(CONFIG_DRM_KMS_HELPER) += drm_kms_helper.o
+obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/

 CFLAGS_drm_trace_points.o := -I$(src)

@ -91,3 +94,4 @@ obj-$(CONFIG_DRM_ARCPGU)+= arc/
 obj-y			+= hisilicon/
 obj-$(CONFIG_DRM_ZTE)	+= zte/
 obj-$(CONFIG_DRM_MXSFB)	+= mxsfb/
+obj-$(CONFIG_DRM_TINYDRM) += tinydrm/
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@ -24,7 +24,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
 	atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
 	amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
 	amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
-	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o
+	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o

 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
@ -34,7 +34,7 @@ amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
 amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce_v6_0.o si_dpm.o si_smc.o

 amdgpu-y += \
-	vi.o
+	vi.o mxgpu_vi.o

 # add GMC block
 amdgpu-y += \
@ -52,8 +52,7 @@ amdgpu-y += \
 # add SMC block
 amdgpu-y += \
 	amdgpu_dpm.o \
-	amdgpu_powerplay.o \
-	cz_smc.o cz_dpm.o
+	amdgpu_powerplay.o

 # add DCE block
 amdgpu-y += \
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@ -91,7 +91,6 @@ extern int amdgpu_vm_fault_stop;
 extern int amdgpu_vm_debug;
 extern int amdgpu_sched_jobs;
 extern int amdgpu_sched_hw_submission;
-extern int amdgpu_powerplay;
 extern int amdgpu_no_evict;
 extern int amdgpu_direct_gma_size;
 extern unsigned amdgpu_pcie_gen_cap;
@ -184,12 +183,18 @@ enum amdgpu_thermal_irq {
 	AMDGPU_THERMAL_IRQ_LAST
 };

+enum amdgpu_kiq_irq {
+	AMDGPU_CP_KIQ_IRQ_DRIVER0 = 0,
+	AMDGPU_CP_KIQ_IRQ_LAST
+};
+
 int amdgpu_set_clockgating_state(struct amdgpu_device *adev,
 				  enum amd_ip_block_type block_type,
 				  enum amd_clockgating_state state);
 int amdgpu_set_powergating_state(struct amdgpu_device *adev,
 				  enum amd_ip_block_type block_type,
 				  enum amd_powergating_state state);
+void amdgpu_get_clockgating_state(struct amdgpu_device *adev, u32 *flags);
 int amdgpu_wait_for_idle(struct amdgpu_device *adev,
 			 enum amd_ip_block_type block_type);
 bool amdgpu_is_idle(struct amdgpu_device *adev,
@ -352,7 +357,7 @@ struct amdgpu_bo_va_mapping {
 	struct list_head		list;
 	struct interval_tree_node	it;
 	uint64_t			offset;
-	uint32_t			flags;
+	uint64_t			flags;
 };

 /* bo virtual addresses in a specific vm */
@ -776,14 +781,20 @@ struct amdgpu_mec {
 	u32 num_queue;
 };

+struct amdgpu_kiq {
+	u64			eop_gpu_addr;
+	struct amdgpu_bo	*eop_obj;
+	struct amdgpu_ring	ring;
+	struct amdgpu_irq_src	irq;
+};
+
 /*
 * GPU scratch registers structures, functions & helpers
 */
 struct amdgpu_scratch {
 	unsigned		num_reg;
 	uint32_t                reg_base;
-	bool			free[32];
-	uint32_t		reg[32];
+	uint32_t		free_mask;
 };

 /*
@ -851,6 +862,7 @@ struct amdgpu_gfx {
 	struct amdgpu_gca_config	config;
 	struct amdgpu_rlc		rlc;
 	struct amdgpu_mec		mec;
+	struct amdgpu_kiq		kiq;
 	struct amdgpu_scratch		scratch;
 	const struct firmware		*me_fw;	/* ME firmware */
 	uint32_t			me_fw_version;
@ -894,8 +906,8 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib,
 		    struct dma_fence *f);
 int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
-		       struct amdgpu_ib *ib, struct dma_fence *last_vm_update,
-		       struct amdgpu_job *job, struct dma_fence **f);
+		       struct amdgpu_ib *ibs, struct amdgpu_job *job,
+		       struct dma_fence **f);
 int amdgpu_ib_pool_init(struct amdgpu_device *adev);
 void amdgpu_ib_pool_fini(struct amdgpu_device *adev);
 int amdgpu_ib_ring_tests(struct amdgpu_device *adev);
@ -938,6 +950,7 @@ struct amdgpu_cs_parser {
 #define AMDGPU_PREAMBLE_IB_PRESENT          (1 << 0) /* bit set means command submit involves a preamble IB */
 #define AMDGPU_PREAMBLE_IB_PRESENT_FIRST    (1 << 1) /* bit set means preamble IB is first presented in belonging context */
 #define AMDGPU_HAVE_CTX_SWITCH              (1 << 2) /* bit set means context switch occured */
+#define AMDGPU_VM_DOMAIN                    (1 << 3) /* bit set means in virtual memory context */

 struct amdgpu_job {
 	struct amd_sched_job    base;
@ -1133,7 +1146,6 @@ int amdgpu_debugfs_fence_init(struct amdgpu_device *adev);

 #if defined(CONFIG_DEBUG_FS)
 int amdgpu_debugfs_init(struct drm_minor *minor);
-void amdgpu_debugfs_cleanup(struct drm_minor *minor);
 #endif

 int amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);
@ -1178,7 +1190,6 @@ struct amdgpu_asic_funcs {
 	bool (*read_disabled_bios)(struct amdgpu_device *adev);
 	bool (*read_bios_from_rom)(struct amdgpu_device *adev,
 				   u8 *bios, u32 length_bytes);
-	void (*detect_hw_virtualization) (struct amdgpu_device *adev);
 	int (*read_register)(struct amdgpu_device *adev, u32 se_num,
 			     u32 sh_num, u32 reg_offset, u32 *value);
 	void (*set_vga_state)(struct amdgpu_device *adev, bool state);
@ -1333,7 +1344,6 @@ struct amdgpu_device {
 	/* BIOS */
 	uint8_t				*bios;
 	uint32_t			bios_size;
-	bool				is_atom_bios;
 	struct amdgpu_bo		*stollen_vga_memory;
 	uint32_t			bios_scratch[AMDGPU_BIOS_NUM_SCRATCH];

@ -1463,7 +1473,7 @@ struct amdgpu_device {
 	/* amdkfd interface */
 	struct kfd_dev          *kfd;

-	struct amdgpu_virtualization virtualization;
+	struct amdgpu_virt	virt;

 	/* link all shadow bo */
 	struct list_head                shadow_list;
@ -1472,6 +1482,9 @@ struct amdgpu_device {
 	spinlock_t			gtt_list_lock;
 	struct list_head                gtt_list;

+	/* record hw reset is performed */
+	bool has_hw_reset;
+
 };

 static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
@ -1576,6 +1589,37 @@ static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
 	ring->count_dw--;
 }

+static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+
+	if (ring->count_dw < count_dw) {
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	} else {
+		occupied = ring->wptr & ring->ptr_mask;
+		dst = (void *)&ring->ring[occupied];
+		chunk1 = ring->ptr_mask + 1 - occupied;
+		chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
+		chunk2 = count_dw - chunk1;
+		chunk1 <<= 2;
+		chunk2 <<= 2;
+
+		if (chunk1)
+			memcpy(dst, src, chunk1);
+
+		if (chunk2) {
+			src += chunk1;
+			dst = (void *)ring->ring;
+			memcpy(dst, src, chunk2);
+		}
+
+		ring->wptr += count_dw;
+		ring->wptr &= ring->ptr_mask;
+		ring->count_dw -= count_dw;
+	}
+}
+
 static inline struct amdgpu_sdma_instance *
 amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 {
@ -1605,7 +1649,6 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_asic_get_gpu_clock_counter(adev) (adev)->asic_funcs->get_gpu_clock_counter((adev))
 #define amdgpu_asic_read_disabled_bios(adev) (adev)->asic_funcs->read_disabled_bios((adev))
 #define amdgpu_asic_read_bios_from_rom(adev, b, l) (adev)->asic_funcs->read_bios_from_rom((adev), (b), (l))
-#define amdgpu_asic_detect_hw_virtualization(adev) (adev)->asic_funcs->detect_hw_virtualization((adev))
 #define amdgpu_asic_read_register(adev, se, sh, offset, v)((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v)))
 #define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid))
 #define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags))
@ -1627,6 +1670,8 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_emit_hdp_invalidate(r) (r)->funcs->emit_hdp_invalidate((r))
 #define amdgpu_ring_emit_switch_buffer(r) (r)->funcs->emit_switch_buffer((r))
 #define amdgpu_ring_emit_cntxcntl(r, d) (r)->funcs->emit_cntxcntl((r), (d))
+#define amdgpu_ring_emit_rreg(r, d) (r)->funcs->emit_rreg((r), (d))
+#define amdgpu_ring_emit_wreg(r, d, v) (r)->funcs->emit_wreg((r), (d), (v))
 #define amdgpu_ring_pad_ib(r, ib) ((r)->funcs->pad_ib((r), (ib)))
 #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
 #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
@ -1658,13 +1703,14 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 int amdgpu_gpu_reset(struct amdgpu_device *adev);
 bool amdgpu_need_backup(struct amdgpu_device *adev);
 void amdgpu_pci_config_reset(struct amdgpu_device *adev);
-bool amdgpu_card_posted(struct amdgpu_device *adev);
+bool amdgpu_need_post(struct amdgpu_device *adev);
 void amdgpu_update_display_priority(struct amdgpu_device *adev);

 int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data);
 int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 ip_type,
 		       u32 ip_instance, u32 ring,
 		       struct amdgpu_ring **out_ring);
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes);
 void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
 bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages);
@ -1711,7 +1757,7 @@ extern const struct drm_ioctl_desc amdgpu_ioctls_kms[];
 extern const int amdgpu_max_kms_ioctl;

 int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags);
-int amdgpu_driver_unload_kms(struct drm_device *dev);
+void amdgpu_driver_unload_kms(struct drm_device *dev);
 void amdgpu_driver_lastclose_kms(struct drm_device *dev);
 int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv);
 void amdgpu_driver_postclose_kms(struct drm_device *dev,
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@ -672,12 +672,10 @@ int amdgpu_acpi_init(struct amdgpu_device *adev)

 			if ((enc->devices & (ATOM_DEVICE_LCD_SUPPORT)) &&
 			    enc->enc_priv) {
-				if (adev->is_atom_bios) {
-					struct amdgpu_encoder_atom_dig *dig = enc->enc_priv;
-					if (dig->bl_dev) {
-						atif->encoder_for_bl = enc;
-						break;
-					}
+				struct amdgpu_encoder_atom_dig *dig = enc->enc_priv;
+				if (dig->bl_dev) {
+					atif->encoder_for_bl = enc;
+					break;
 				}
 			}
 		}
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@ -42,6 +42,51 @@
 #define AMD_IS_VALID_VBIOS(p) ((p)[0] == 0x55 && (p)[1] == 0xAA)
 #define AMD_VBIOS_LENGTH(p) ((p)[2] << 9)

+/* Check if current bios is an ATOM BIOS.
+ * Return true if it is ATOM BIOS. Otherwise, return false.
+ */
+static bool check_atom_bios(uint8_t *bios, size_t size)
+{
+	uint16_t tmp, bios_header_start;
+
+	if (!bios || size < 0x49) {
+		DRM_INFO("vbios mem is null or mem size is wrong\n");
+		return false;
+	}
+
+	if (!AMD_IS_VALID_VBIOS(bios)) {
+		DRM_INFO("BIOS signature incorrect %x %x\n", bios[0], bios[1]);
+		return false;
+	}
+
+	tmp = bios[0x18] | (bios[0x19] << 8);
+	if (bios[tmp + 0x14] != 0x0) {
+		DRM_INFO("Not an x86 BIOS ROM\n");
+		return false;
+	}
+
+	bios_header_start = bios[0x48] | (bios[0x49] << 8);
+	if (!bios_header_start) {
+		DRM_INFO("Can't locate bios header\n");
+		return false;
+	}
+
+	tmp = bios_header_start + 4;
+	if (size < tmp) {
+		DRM_INFO("BIOS header is broken\n");
+		return false;
+	}
+
+	if (!memcmp(bios + tmp, "ATOM", 4) ||
+	    !memcmp(bios + tmp, "MOTA", 4)) {
+		DRM_DEBUG("ATOMBIOS detected\n");
+		return true;
+	}
+
+	return false;
+}
+
+
 /* If you boot an IGP board with a discrete card as the primary,
 * the IGP rom is not accessible via the rom bar as the IGP rom is
 * part of the system bios.  On boot, the system bios puts a
@ -55,7 +100,7 @@ static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
 	resource_size_t size = 256 * 1024; /* ??? */

 	if (!(adev->flags & AMD_IS_APU))
-		if (!amdgpu_card_posted(adev))
+		if (amdgpu_need_post(adev))
 			return false;

 	adev->bios = NULL;
@ -65,10 +110,6 @@ static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
 		return false;
 	}

-	if (size == 0 || !AMD_IS_VALID_VBIOS(bios)) {
-		iounmap(bios);
-		return false;
-	}
 	adev->bios = kmalloc(size, GFP_KERNEL);
 	if (!adev->bios) {
 		iounmap(bios);
@ -77,12 +118,18 @@ static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
 	adev->bios_size = size;
 	memcpy_fromio(adev->bios, bios, size);
 	iounmap(bios);
+
+	if (!check_atom_bios(adev->bios, size)) {
+		kfree(adev->bios);
+		return false;
+	}
+
 	return true;
 }

 bool amdgpu_read_bios(struct amdgpu_device *adev)
 {
-	uint8_t __iomem *bios, val[2];
+	uint8_t __iomem *bios;
 	size_t size;

 	adev->bios = NULL;
@ -92,13 +139,6 @@ bool amdgpu_read_bios(struct amdgpu_device *adev)
 		return false;
 	}

-	val[0] = readb(&bios[0]);
-	val[1] = readb(&bios[1]);
-
-	if (size == 0 || !AMD_IS_VALID_VBIOS(val)) {
-		pci_unmap_rom(adev->pdev, bios);
-		return false;
-	}
 	adev->bios = kzalloc(size, GFP_KERNEL);
 	if (adev->bios == NULL) {
 		pci_unmap_rom(adev->pdev, bios);
@ -107,6 +147,12 @@ bool amdgpu_read_bios(struct amdgpu_device *adev)
 	adev->bios_size = size;
 	memcpy_fromio(adev->bios, bios, size);
 	pci_unmap_rom(adev->pdev, bios);
+
+	if (!check_atom_bios(adev->bios, size)) {
+		kfree(adev->bios);
+		return false;
+	}
+
 	return true;
 }

@ -140,7 +186,14 @@ static bool amdgpu_read_bios_from_rom(struct amdgpu_device *adev)
 	adev->bios_size = len;

 	/* read complete BIOS */
-	return amdgpu_asic_read_bios_from_rom(adev, adev->bios, len);
+	amdgpu_asic_read_bios_from_rom(adev, adev->bios, len);
+
+	if (!check_atom_bios(adev->bios, len)) {
+		kfree(adev->bios);
+		return false;
+	}
+
+	return true;
 }

 static bool amdgpu_read_platform_bios(struct amdgpu_device *adev)
@ -155,13 +208,17 @@ static bool amdgpu_read_platform_bios(struct amdgpu_device *adev)
 		return false;
 	}

-	if (size == 0 || !AMD_IS_VALID_VBIOS(bios)) {
-		return false;
-	}
-	adev->bios = kmemdup(bios, size, GFP_KERNEL);
-	if (adev->bios == NULL) {
+	adev->bios = kzalloc(size, GFP_KERNEL);
+	if (adev->bios == NULL)
+		return false;
+
+	memcpy_fromio(adev->bios, bios, size);
+
+	if (!check_atom_bios(adev->bios, size)) {
+		kfree(adev->bios);
 		return false;
 	}
+
 	adev->bios_size = size;

 	return true;
@ -273,7 +330,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev)
 			break;
 	}

-	if (i == 0 || !AMD_IS_VALID_VBIOS(adev->bios)) {
+	if (!check_atom_bios(adev->bios, size)) {
 		kfree(adev->bios);
 		return false;
 	}
@ -298,53 +355,59 @@ static bool amdgpu_read_disabled_bios(struct amdgpu_device *adev)
 #ifdef CONFIG_ACPI
 static bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev)
 {
-	bool ret = false;
 	struct acpi_table_header *hdr;
 	acpi_size tbl_size;
 	UEFI_ACPI_VFCT *vfct;
-	GOP_VBIOS_CONTENT *vbios;
-	VFCT_IMAGE_HEADER *vhdr;
+	unsigned offset;

 	if (!ACPI_SUCCESS(acpi_get_table("VFCT", 1, &hdr)))
 		return false;
 	tbl_size = hdr->length;
 	if (tbl_size < sizeof(UEFI_ACPI_VFCT)) {
 		DRM_ERROR("ACPI VFCT table present but broken (too short #1)\n");
-		goto out_unmap;
+		return false;
 	}

 	vfct = (UEFI_ACPI_VFCT *)hdr;
-	if (vfct->VBIOSImageOffset + sizeof(VFCT_IMAGE_HEADER) > tbl_size) {
-		DRM_ERROR("ACPI VFCT table present but broken (too short #2)\n");
-		goto out_unmap;
+	offset = vfct->VBIOSImageOffset;
+
+	while (offset < tbl_size) {
+		GOP_VBIOS_CONTENT *vbios = (GOP_VBIOS_CONTENT *)((char *)hdr + offset);
+		VFCT_IMAGE_HEADER *vhdr = &vbios->VbiosHeader;
+
+		offset += sizeof(VFCT_IMAGE_HEADER);
+		if (offset > tbl_size) {
+			DRM_ERROR("ACPI VFCT image header truncated\n");
+			return false;
+		}
+
+		offset += vhdr->ImageLength;
+		if (offset > tbl_size) {
+			DRM_ERROR("ACPI VFCT image truncated\n");
+			return false;
+		}
+
+		if (vhdr->ImageLength &&
+		    vhdr->PCIBus == adev->pdev->bus->number &&
+		    vhdr->PCIDevice == PCI_SLOT(adev->pdev->devfn) &&
+		    vhdr->PCIFunction == PCI_FUNC(adev->pdev->devfn) &&
+		    vhdr->VendorID == adev->pdev->vendor &&
+		    vhdr->DeviceID == adev->pdev->device) {
+			adev->bios = kmemdup(&vbios->VbiosContent,
+					     vhdr->ImageLength,
+					     GFP_KERNEL);
+
+			if (!check_atom_bios(adev->bios, vhdr->ImageLength)) {
+				kfree(adev->bios);
+				return false;
+			}
+			adev->bios_size = vhdr->ImageLength;
+			return true;
+		}
 	}

-	vbios = (GOP_VBIOS_CONTENT *)((char *)hdr + vfct->VBIOSImageOffset);
-	vhdr = &vbios->VbiosHeader;
-	DRM_INFO("ACPI VFCT contains a BIOS for %02x:%02x.%d %04x:%04x, size %d\n",
-			vhdr->PCIBus, vhdr->PCIDevice, vhdr->PCIFunction,
-			vhdr->VendorID, vhdr->DeviceID, vhdr->ImageLength);
-
-	if (vhdr->PCIBus != adev->pdev->bus->number ||
-	    vhdr->PCIDevice != PCI_SLOT(adev->pdev->devfn) ||
-	    vhdr->PCIFunction != PCI_FUNC(adev->pdev->devfn) ||
-	    vhdr->VendorID != adev->pdev->vendor ||
-	    vhdr->DeviceID != adev->pdev->device) {
-		DRM_INFO("ACPI VFCT table is not for this card\n");
-		goto out_unmap;
-	}
-
-	if (vfct->VBIOSImageOffset + sizeof(VFCT_IMAGE_HEADER) + vhdr->ImageLength > tbl_size) {
-		DRM_ERROR("ACPI VFCT image truncated\n");
-		goto out_unmap;
-	}
-
-	adev->bios = kmemdup(&vbios->VbiosContent, vhdr->ImageLength, GFP_KERNEL);
-	adev->bios_size = vhdr->ImageLength;
-	ret = !!adev->bios;
-
-out_unmap:
-	return ret;
+	DRM_ERROR("ACPI VFCT table present but broken (too short #2)\n");
+	return false;
 }
 #else
 static inline bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev)
@ -355,57 +418,27 @@ static inline bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev)

 bool amdgpu_get_bios(struct amdgpu_device *adev)
 {
-	bool r;
-	uint16_t tmp, bios_header_start;
+	if (amdgpu_atrm_get_bios(adev))
+		return true;

-	r = amdgpu_atrm_get_bios(adev);
-	if (!r)
-		r = amdgpu_acpi_vfct_bios(adev);
-	if (!r)
-		r = igp_read_bios_from_vram(adev);
-	if (!r)
-		r = amdgpu_read_bios(adev);
-	if (!r) {
-		r = amdgpu_read_bios_from_rom(adev);
-	}
-	if (!r) {
-		r = amdgpu_read_disabled_bios(adev);
-	}
-	if (!r) {
-		r = amdgpu_read_platform_bios(adev);
-	}
-	if (!r || adev->bios == NULL) {
-		DRM_ERROR("Unable to locate a BIOS ROM\n");
-		adev->bios = NULL;
-		return false;
-	}
-	if (!AMD_IS_VALID_VBIOS(adev->bios)) {
-		printk("BIOS signature incorrect %x %x\n", adev->bios[0], adev->bios[1]);
-		goto free_bios;
-	}
+	if (amdgpu_acpi_vfct_bios(adev))
+		return true;

-	tmp = RBIOS16(0x18);
-	if (RBIOS8(tmp + 0x14) != 0x0) {
-		DRM_INFO("Not an x86 BIOS ROM, not using.\n");
-		goto free_bios;
-	}
+	if (igp_read_bios_from_vram(adev))
+		return true;

-	bios_header_start = RBIOS16(0x48);
-	if (!bios_header_start) {
-		goto free_bios;
-	}
-	tmp = bios_header_start + 4;
-	if (!memcmp(adev->bios + tmp, "ATOM", 4) ||
-	    !memcmp(adev->bios + tmp, "MOTA", 4)) {
-		adev->is_atom_bios = true;
-	} else {
-		adev->is_atom_bios = false;
-	}
+	if (amdgpu_read_bios(adev))
+		return true;

-	DRM_DEBUG("%sBIOS detected\n", adev->is_atom_bios ? "ATOM" : "COM");
-	return true;
-free_bios:
-	kfree(adev->bios);
-	adev->bios = NULL;
+	if (amdgpu_read_bios_from_rom(adev))
+		return true;
+
+	if (amdgpu_read_disabled_bios(adev))
+		return true;
+
+	if (amdgpu_read_platform_bios(adev))
+		return true;
+
+	DRM_ERROR("Unable to locate a BIOS ROM\n");
 	return false;
 }
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
@ -713,6 +713,7 @@ static int amdgpu_cgs_rel_firmware(struct cgs_device *cgs_device, enum cgs_ucode
 	CGS_FUNC_ADEV;
 	if ((CGS_UCODE_ID_SMU == type) || (CGS_UCODE_ID_SMU_SK == type)) {
 		release_firmware(adev->pm.fw);
+		adev->pm.fw = NULL;
 		return 0;
 	}
 	/* cannot release other firmware because they are not created by cgs */
@ -762,6 +763,23 @@ static uint16_t amdgpu_get_firmware_version(struct cgs_device *cgs_device,
 	return fw_version;
 }

+static int amdgpu_cgs_enter_safe_mode(struct cgs_device *cgs_device,
+					bool en)
+{
+	CGS_FUNC_ADEV;
+
+	if (adev->gfx.rlc.funcs->enter_safe_mode == NULL ||
+		adev->gfx.rlc.funcs->exit_safe_mode == NULL)
+		return 0;
+
+	if (en)
+		adev->gfx.rlc.funcs->enter_safe_mode(adev);
+	else
+		adev->gfx.rlc.funcs->exit_safe_mode(adev);
+
+	return 0;
+}
+
 static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
 					enum cgs_ucode_id type,
 					struct cgs_firmware_info *info)
@ -808,37 +826,65 @@ static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
 		const uint8_t *src;
 		const struct smc_firmware_header_v1_0 *hdr;

+		if (CGS_UCODE_ID_SMU_SK == type)
+			amdgpu_cgs_rel_firmware(cgs_device, CGS_UCODE_ID_SMU);
+
 		if (!adev->pm.fw) {
 			switch (adev->asic_type) {
 			case CHIP_TOPAZ:
 				if (((adev->pdev->device == 0x6900) && (adev->pdev->revision == 0x81)) ||
 				    ((adev->pdev->device == 0x6900) && (adev->pdev->revision == 0x83)) ||
-				    ((adev->pdev->device == 0x6907) && (adev->pdev->revision == 0x87)))
+				    ((adev->pdev->device == 0x6907) && (adev->pdev->revision == 0x87))) {
+					info->is_kicker = true;
 					strcpy(fw_name, "amdgpu/topaz_k_smc.bin");
-				else
+				} else
 					strcpy(fw_name, "amdgpu/topaz_smc.bin");
 				break;
 			case CHIP_TONGA:
 				if (((adev->pdev->device == 0x6939) && (adev->pdev->revision == 0xf1)) ||
-				    ((adev->pdev->device == 0x6938) && (adev->pdev->revision == 0xf1)))
+				    ((adev->pdev->device == 0x6938) && (adev->pdev->revision == 0xf1))) {
+					info->is_kicker = true;
 					strcpy(fw_name, "amdgpu/tonga_k_smc.bin");
-				else
+				} else
 					strcpy(fw_name, "amdgpu/tonga_smc.bin");
 				break;
 			case CHIP_FIJI:
 				strcpy(fw_name, "amdgpu/fiji_smc.bin");
 				break;
 			case CHIP_POLARIS11:
-				if (type == CGS_UCODE_ID_SMU)
-					strcpy(fw_name, "amdgpu/polaris11_smc.bin");
-				else if (type == CGS_UCODE_ID_SMU_SK)
+				if (type == CGS_UCODE_ID_SMU) {
+					if (((adev->pdev->device == 0x67ef) &&
+					     ((adev->pdev->revision == 0xe0) ||
+					      (adev->pdev->revision == 0xe2) ||
+					      (adev->pdev->revision == 0xe5))) ||
+					    ((adev->pdev->device == 0x67ff) &&
+					     ((adev->pdev->revision == 0xcf) ||
+					      (adev->pdev->revision == 0xef) ||
+					      (adev->pdev->revision == 0xff)))) {
+						info->is_kicker = true;
+						strcpy(fw_name, "amdgpu/polaris11_k_smc.bin");
+					} else
+						strcpy(fw_name, "amdgpu/polaris11_smc.bin");
+				} else if (type == CGS_UCODE_ID_SMU_SK) {
 					strcpy(fw_name, "amdgpu/polaris11_smc_sk.bin");
+				}
 				break;
 			case CHIP_POLARIS10:
-				if (type == CGS_UCODE_ID_SMU)
-					strcpy(fw_name, "amdgpu/polaris10_smc.bin");
-				else if (type == CGS_UCODE_ID_SMU_SK)
+				if (type == CGS_UCODE_ID_SMU) {
+					if ((adev->pdev->device == 0x67df) &&
+					    ((adev->pdev->revision == 0xe0) ||
+					     (adev->pdev->revision == 0xe3) ||
+					     (adev->pdev->revision == 0xe4) ||
+					     (adev->pdev->revision == 0xe5) ||
+					     (adev->pdev->revision == 0xe7) ||
+					     (adev->pdev->revision == 0xef))) {
+						info->is_kicker = true;
+						strcpy(fw_name, "amdgpu/polaris10_k_smc.bin");
+					} else
+						strcpy(fw_name, "amdgpu/polaris10_smc.bin");
+				} else if (type == CGS_UCODE_ID_SMU_SK) {
 					strcpy(fw_name, "amdgpu/polaris10_smc_sk.bin");
+				}
 				break;
 			case CHIP_POLARIS12:
 				strcpy(fw_name, "amdgpu/polaris12_smc.bin");
@ -1200,51 +1246,52 @@ static int amdgpu_cgs_call_acpi_method(struct cgs_device *cgs_device,
 }

 static const struct cgs_ops amdgpu_cgs_ops = {
-	amdgpu_cgs_gpu_mem_info,
-	amdgpu_cgs_gmap_kmem,
-	amdgpu_cgs_gunmap_kmem,
-	amdgpu_cgs_alloc_gpu_mem,
-	amdgpu_cgs_free_gpu_mem,
-	amdgpu_cgs_gmap_gpu_mem,
-	amdgpu_cgs_gunmap_gpu_mem,
-	amdgpu_cgs_kmap_gpu_mem,
-	amdgpu_cgs_kunmap_gpu_mem,
-	amdgpu_cgs_read_register,
-	amdgpu_cgs_write_register,
-	amdgpu_cgs_read_ind_register,
-	amdgpu_cgs_write_ind_register,
-	amdgpu_cgs_read_pci_config_byte,
-	amdgpu_cgs_read_pci_config_word,
-	amdgpu_cgs_read_pci_config_dword,
-	amdgpu_cgs_write_pci_config_byte,
-	amdgpu_cgs_write_pci_config_word,
-	amdgpu_cgs_write_pci_config_dword,
-	amdgpu_cgs_get_pci_resource,
-	amdgpu_cgs_atom_get_data_table,
-	amdgpu_cgs_atom_get_cmd_table_revs,
-	amdgpu_cgs_atom_exec_cmd_table,
-	amdgpu_cgs_create_pm_request,
-	amdgpu_cgs_destroy_pm_request,
-	amdgpu_cgs_set_pm_request,
-	amdgpu_cgs_pm_request_clock,
-	amdgpu_cgs_pm_request_engine,
-	amdgpu_cgs_pm_query_clock_limits,
-	amdgpu_cgs_set_camera_voltages,
-	amdgpu_cgs_get_firmware_info,
-	amdgpu_cgs_rel_firmware,
-	amdgpu_cgs_set_powergating_state,
-	amdgpu_cgs_set_clockgating_state,
-	amdgpu_cgs_get_active_displays_info,
-	amdgpu_cgs_notify_dpm_enabled,
-	amdgpu_cgs_call_acpi_method,
-	amdgpu_cgs_query_system_info,
-	amdgpu_cgs_is_virtualization_enabled
+	.gpu_mem_info = amdgpu_cgs_gpu_mem_info,
+	.gmap_kmem = amdgpu_cgs_gmap_kmem,
+	.gunmap_kmem = amdgpu_cgs_gunmap_kmem,
+	.alloc_gpu_mem = amdgpu_cgs_alloc_gpu_mem,
+	.free_gpu_mem = amdgpu_cgs_free_gpu_mem,
+	.gmap_gpu_mem = amdgpu_cgs_gmap_gpu_mem,
+	.gunmap_gpu_mem = amdgpu_cgs_gunmap_gpu_mem,
+	.kmap_gpu_mem = amdgpu_cgs_kmap_gpu_mem,
+	.kunmap_gpu_mem = amdgpu_cgs_kunmap_gpu_mem,
+	.read_register = amdgpu_cgs_read_register,
+	.write_register = amdgpu_cgs_write_register,
+	.read_ind_register = amdgpu_cgs_read_ind_register,
+	.write_ind_register = amdgpu_cgs_write_ind_register,
+	.read_pci_config_byte = amdgpu_cgs_read_pci_config_byte,
+	.read_pci_config_word = amdgpu_cgs_read_pci_config_word,
+	.read_pci_config_dword = amdgpu_cgs_read_pci_config_dword,
+	.write_pci_config_byte = amdgpu_cgs_write_pci_config_byte,
+	.write_pci_config_word = amdgpu_cgs_write_pci_config_word,
+	.write_pci_config_dword = amdgpu_cgs_write_pci_config_dword,
+	.get_pci_resource = amdgpu_cgs_get_pci_resource,
+	.atom_get_data_table = amdgpu_cgs_atom_get_data_table,
+	.atom_get_cmd_table_revs = amdgpu_cgs_atom_get_cmd_table_revs,
+	.atom_exec_cmd_table = amdgpu_cgs_atom_exec_cmd_table,
+	.create_pm_request = amdgpu_cgs_create_pm_request,
+	.destroy_pm_request = amdgpu_cgs_destroy_pm_request,
+	.set_pm_request = amdgpu_cgs_set_pm_request,
+	.pm_request_clock = amdgpu_cgs_pm_request_clock,
+	.pm_request_engine = amdgpu_cgs_pm_request_engine,
+	.pm_query_clock_limits = amdgpu_cgs_pm_query_clock_limits,
+	.set_camera_voltages = amdgpu_cgs_set_camera_voltages,
+	.get_firmware_info = amdgpu_cgs_get_firmware_info,
+	.rel_firmware = amdgpu_cgs_rel_firmware,
+	.set_powergating_state = amdgpu_cgs_set_powergating_state,
+	.set_clockgating_state = amdgpu_cgs_set_clockgating_state,
+	.get_active_displays_info = amdgpu_cgs_get_active_displays_info,
+	.notify_dpm_enabled = amdgpu_cgs_notify_dpm_enabled,
+	.call_acpi_method = amdgpu_cgs_call_acpi_method,
+	.query_system_info = amdgpu_cgs_query_system_info,
+	.is_virtualization_enabled = amdgpu_cgs_is_virtualization_enabled,
+	.enter_safe_mode = amdgpu_cgs_enter_safe_mode,
 };

 static const struct cgs_os_ops amdgpu_cgs_os_ops = {
-	amdgpu_cgs_add_irq_source,
-	amdgpu_cgs_irq_get,
-	amdgpu_cgs_irq_put
+	.add_irq_source = amdgpu_cgs_add_irq_source,
+	.irq_get = amdgpu_cgs_irq_get,
+	.irq_put = amdgpu_cgs_irq_put
 };

 struct cgs_device *amdgpu_cgs_create_device(struct amdgpu_device *adev)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@ -75,10 +75,10 @@ int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 ip_type,
 		*out_ring = &adev->uvd.ring;
 		break;
 	case AMDGPU_HW_IP_VCE:
-		if (ring < 2){
+		if (ring < adev->vce.num_rings){
 			*out_ring = &adev->vce.ring[ring];
 		} else {
-			DRM_ERROR("only two VCE rings are supported\n");
+			DRM_ERROR("only %d VCE rings are supported\n", adev->vce.num_rings);
 			return -EINVAL;
 		}
 		break;
@ -351,8 +351,7 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 * submission. This can result in a debt that can stop buffer migrations
 * temporarily.
 */
-static void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev,
-					 u64 num_bytes)
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes)
 {
 	spin_lock(&adev->mm_stats.lock);
 	adev->mm_stats.accum_us -= bytes_to_us(adev, num_bytes);
@ -778,6 +777,20 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p,
 	if (r)
 		return r;

+	if (amdgpu_sriov_vf(adev)) {
+		struct dma_fence *f;
+		bo_va = vm->csa_bo_va;
+		BUG_ON(!bo_va);
+		r = amdgpu_vm_bo_update(adev, bo_va, false);
+		if (r)
+			return r;
+
+		f = bo_va->last_pt_update;
+		r = amdgpu_sync_fence(adev, &p->job->sync, f);
+		if (r)
+			return r;
+	}
+
 	if (p->bo_list) {
 		for (i = 0; i < p->bo_list->num_entries; i++) {
 			struct dma_fence *f;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@ -94,6 +94,11 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
 {
 	uint32_t ret;

+	if (amdgpu_sriov_runtime(adev)) {
+		BUG_ON(in_interrupt());
+		return amdgpu_virt_kiq_rreg(adev, reg);
+	}
+
 	if ((reg * 4) < adev->rmmio_size && !always_indirect)
 		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
 	else {
@ -113,6 +118,11 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 {
 	trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);

+	if (amdgpu_sriov_runtime(adev)) {
+		BUG_ON(in_interrupt());
+		return amdgpu_virt_kiq_wreg(adev, reg, v);
+	}
+
 	if ((reg * 4) < adev->rmmio_size && !always_indirect)
 		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
 	else {
@ -609,25 +619,29 @@ void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
 * GPU helpers function.
 */
 /**
- * amdgpu_card_posted - check if the hw has already been initialized
+ * amdgpu_need_post - check if the hw need post or not
 *
 * @adev: amdgpu_device pointer
 *
- * Check if the asic has been initialized (all asics).
- * Used at driver startup.
- * Returns true if initialized or false if not.
+ * Check if the asic has been initialized (all asics) at driver startup
+ * or post is needed if  hw reset is performed.
+ * Returns true if need or false if not.
 */
-bool amdgpu_card_posted(struct amdgpu_device *adev)
+bool amdgpu_need_post(struct amdgpu_device *adev)
 {
 	uint32_t reg;

+	if (adev->has_hw_reset) {
+		adev->has_hw_reset = false;
+		return true;
+	}
 	/* then check MEM_SIZE, in case the crtcs are off */
 	reg = RREG32(mmCONFIG_MEMSIZE);

 	if (reg)
-		return true;
+		return false;

-	return false;
+	return true;

 }

@ -655,7 +669,7 @@ static bool amdgpu_vpost_needed(struct amdgpu_device *adev)
 				return true;
 		}
 	}
-	return !amdgpu_card_posted(adev);
+	return amdgpu_need_post(adev);
 }

 /**
@ -885,7 +899,7 @@ static int amdgpu_atombios_init(struct amdgpu_device *adev)
 		atom_card_info->ioreg_read = cail_ioreg_read;
 		atom_card_info->ioreg_write = cail_ioreg_write;
 	} else {
-		DRM_ERROR("Unable to find PCI I/O BAR; using MMIO for ATOM IIO\n");
+		DRM_INFO("PCI I/O BAR is not found. Using MMIO to access ATOM BIOS\n");
 		atom_card_info->ioreg_read = cail_reg_read;
 		atom_card_info->ioreg_write = cail_reg_write;
 	}
@ -1131,6 +1145,18 @@ int amdgpu_set_powergating_state(struct amdgpu_device *adev,
 	return r;
 }

+void amdgpu_get_clockgating_state(struct amdgpu_device *adev, u32 *flags)
+{
+	int i;
+
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (adev->ip_blocks[i].version->funcs->get_clockgating_state)
+			adev->ip_blocks[i].version->funcs->get_clockgating_state((void *)adev, flags);
+	}
+}
+
 int amdgpu_wait_for_idle(struct amdgpu_device *adev,
 			 enum amd_ip_block_type block_type)
 {
@ -1235,7 +1261,8 @@ static void amdgpu_device_enable_virtual_display(struct amdgpu_device *adev)
 		pciaddstr_tmp = pciaddstr;
 		while ((pciaddname_tmp = strsep(&pciaddstr_tmp, ";"))) {
 			pciaddname = strsep(&pciaddname_tmp, ",");
-			if (!strcmp(pci_address_name, pciaddname)) {
+			if (!strcmp("all", pciaddname)
+			    || !strcmp(pci_address_name, pciaddname)) {
 				long num_crtc;
 				int res = -1;

@ -1323,6 +1350,12 @@ static int amdgpu_early_init(struct amdgpu_device *adev)
 		return -EINVAL;
 	}

+	if (amdgpu_sriov_vf(adev)) {
+		r = amdgpu_virt_request_full_gpu(adev, true);
+		if (r)
+			return r;
+	}
+
 	for (i = 0; i < adev->num_ip_blocks; i++) {
 		if ((amdgpu_ip_block_mask & (1 << i)) == 0) {
 			DRM_ERROR("disabled ip block: %d\n", i);
@ -1383,6 +1416,15 @@ static int amdgpu_init(struct amdgpu_device *adev)
 				return r;
 			}
 			adev->ip_blocks[i].status.hw = true;
+
+			/* right after GMC hw init, we create CSA */
+			if (amdgpu_sriov_vf(adev)) {
+				r = amdgpu_allocate_static_csa(adev);
+				if (r) {
+					DRM_ERROR("allocate CSA failed %d\n", r);
+					return r;
+				}
+			}
 		}
 	}

@ -1516,6 +1558,11 @@ static int amdgpu_fini(struct amdgpu_device *adev)
 		adev->ip_blocks[i].status.late_initialized = false;
 	}

+	if (amdgpu_sriov_vf(adev)) {
+		amdgpu_bo_free_kernel(&adev->virt.csa_obj, &adev->virt.csa_vmid0_addr, NULL);
+		amdgpu_virt_release_full_gpu(adev, false);
+	}
+
 	return 0;
 }

@ -1523,6 +1570,9 @@ int amdgpu_suspend(struct amdgpu_device *adev)
 {
 	int i, r;

+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_request_full_gpu(adev, false);
+
 	/* ungate SMC block first */
 	r = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_SMC,
 					 AMD_CG_STATE_UNGATE);
@ -1551,6 +1601,9 @@ int amdgpu_suspend(struct amdgpu_device *adev)
 		}
 	}

+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_release_full_gpu(adev, false);
+
 	return 0;
 }

@ -1575,7 +1628,7 @@ static int amdgpu_resume(struct amdgpu_device *adev)
 static void amdgpu_device_detect_sriov_bios(struct amdgpu_device *adev)
 {
 	if (amdgpu_atombios_has_gpu_virtualization_table(adev))
-		adev->virtualization.virtual_caps |= AMDGPU_SRIOV_CAPS_SRIOV_VBIOS;
+		adev->virt.caps |= AMDGPU_SRIOV_CAPS_SRIOV_VBIOS;
 }

 /**
@ -1605,7 +1658,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	adev->pdev = pdev;
 	adev->flags = flags;
 	adev->asic_type = flags & AMD_ASIC_MASK;
-	adev->is_atom_bios = false;
 	adev->usec_timeout = AMDGPU_MAX_USEC_TIMEOUT;
 	adev->mc.gtt_size = 512 * 1024 * 1024;
 	adev->accel_working = false;
@ -1695,7 +1747,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		}
 	}
 	if (adev->rio_mem == NULL)
-		DRM_ERROR("Unable to find PCI I/O BAR\n");
+		DRM_INFO("PCI I/O BAR is not found.\n");

 	/* early init functions */
 	r = amdgpu_early_init(adev);
@ -1720,12 +1772,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		r = -EINVAL;
 		goto failed;
 	}
-	/* Must be an ATOMBIOS */
-	if (!adev->is_atom_bios) {
-		dev_err(adev->dev, "Expecting atombios for GPU\n");
-		r = -EINVAL;
-		goto failed;
-	}
+
 	r = amdgpu_atombios_init(adev);
 	if (r) {
 		dev_err(adev->dev, "amdgpu_atombios_init failed\n");
@ -1852,8 +1899,6 @@ failed:
 	return r;
 }

-static void amdgpu_debugfs_remove_files(struct amdgpu_device *adev);
-
 /**
 * amdgpu_device_fini - tear down the driver
 *
@ -1893,7 +1938,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 	if (adev->asic_type >= CHIP_BONAIRE)
 		amdgpu_doorbell_fini(adev);
 	amdgpu_debugfs_regs_cleanup(adev);
-	amdgpu_debugfs_remove_files(adev);
 }


@ -2031,7 +2075,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon)
 	amdgpu_atombios_scratch_regs_restore(adev);

 	/* post card */
-	if (!amdgpu_card_posted(adev) || !resume) {
+	if (amdgpu_need_post(adev)) {
 		r = amdgpu_atom_asic_init(adev->mode_info.atom_context);
 		if (r)
 			DRM_ERROR("amdgpu asic init failed\n");
@ -2252,6 +2296,9 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 	int resched;
 	bool need_full_reset;

+	if (amdgpu_sriov_vf(adev))
+		return 0;
+
 	if (!amdgpu_check_soft_reset(adev)) {
 		DRM_INFO("No hardware hang detected. Did some blocks stall?\n");
 		return 0;
@ -2507,19 +2554,6 @@ int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
 	return 0;
 }

-static void amdgpu_debugfs_remove_files(struct amdgpu_device *adev)
-{
-#if defined(CONFIG_DEBUG_FS)
-	unsigned i;
-
-	for (i = 0; i < adev->debugfs_count; i++) {
-		drm_debugfs_remove_files(adev->debugfs[i].files,
-					 adev->debugfs[i].num_files,
-					 adev->ddev->primary);
-	}
-#endif
-}
-
 #if defined(CONFIG_DEBUG_FS)

 static ssize_t amdgpu_debugfs_regs_read(struct file *f, char __user *buf,
@ -2853,7 +2887,7 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct file *f, char __user *buf,
 		return -ENOMEM;

 	/* version, increment each time something is added */
-	config[no_regs++] = 2;
+	config[no_regs++] = 3;
 	config[no_regs++] = adev->gfx.config.max_shader_engines;
 	config[no_regs++] = adev->gfx.config.max_tile_pipes;
 	config[no_regs++] = adev->gfx.config.max_cu_per_sh;
@ -2887,6 +2921,12 @@ static ssize_t amdgpu_debugfs_gca_config_read(struct file *f, char __user *buf,
 	config[no_regs++] = adev->family;
 	config[no_regs++] = adev->external_rev_id;

+	/* rev==3 */
+	config[no_regs++] = adev->pdev->device;
+	config[no_regs++] = adev->pdev->revision;
+	config[no_regs++] = adev->pdev->subsystem_device;
+	config[no_regs++] = adev->pdev->subsystem_vendor;
+
 	while (size && (*pos < no_regs * 4)) {
 		uint32_t value;

@ -3153,10 +3193,6 @@ int amdgpu_debugfs_init(struct drm_minor *minor)
 {
 	return 0;
 }
-
-void amdgpu_debugfs_cleanup(struct drm_minor *minor)
-{
-}
 #else
 static int amdgpu_debugfs_regs_init(struct amdgpu_device *adev)
 {
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@ -138,10 +138,52 @@ static void amdgpu_unpin_work_func(struct work_struct *__work)
 	kfree(work);
 }

-int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_pending_vblank_event *event,
-				 uint32_t page_flip_flags, uint32_t target)
+
+static void amdgpu_flip_work_cleanup(struct amdgpu_flip_work *work)
+{
+	int i;
+
+	amdgpu_bo_unref(&work->old_abo);
+	dma_fence_put(work->excl);
+	for (i = 0; i < work->shared_count; ++i)
+		dma_fence_put(work->shared[i]);
+	kfree(work->shared);
+	kfree(work);
+}
+
+static void amdgpu_flip_cleanup_unreserve(struct amdgpu_flip_work *work,
+					  struct amdgpu_bo *new_abo)
+{
+	amdgpu_bo_unreserve(new_abo);
+	amdgpu_flip_work_cleanup(work);
+}
+
+static void amdgpu_flip_cleanup_unpin(struct amdgpu_flip_work *work,
+				      struct amdgpu_bo *new_abo)
+{
+	if (unlikely(amdgpu_bo_unpin(new_abo) != 0))
+		DRM_ERROR("failed to unpin new abo in error path\n");
+	amdgpu_flip_cleanup_unreserve(work, new_abo);
+}
+
+void amdgpu_crtc_cleanup_flip_ctx(struct amdgpu_flip_work *work,
+				  struct amdgpu_bo *new_abo)
+{
+	if (unlikely(amdgpu_bo_reserve(new_abo, false) != 0)) {
+		DRM_ERROR("failed to reserve new abo in error path\n");
+		amdgpu_flip_work_cleanup(work);
+		return;
+	}
+	amdgpu_flip_cleanup_unpin(work, new_abo);
+}
+
+int amdgpu_crtc_prepare_flip(struct drm_crtc *crtc,
+			     struct drm_framebuffer *fb,
+			     struct drm_pending_vblank_event *event,
+			     uint32_t page_flip_flags,
+			     uint32_t target,
+			     struct amdgpu_flip_work **work_p,
+			     struct amdgpu_bo **new_abo_p)
 {
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
@ -154,7 +196,7 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
 	unsigned long flags;
 	u64 tiling_flags;
 	u64 base;
-	int i, r;
+	int r;

 	work = kzalloc(sizeof *work, GFP_KERNEL);
 	if (work == NULL)
@ -189,7 +231,6 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,

 	r = amdgpu_bo_pin(new_abo, AMDGPU_GEM_DOMAIN_VRAM, &base);
 	if (unlikely(r != 0)) {
-		r = -EINVAL;
 		DRM_ERROR("failed to pin new abo buffer before flip\n");
 		goto unreserve;
 	}
@ -216,41 +257,79 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
 		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
 		r = -EBUSY;
 		goto pflip_cleanup;
+
 	}
-
-	amdgpu_crtc->pflip_status = AMDGPU_FLIP_PENDING;
-	amdgpu_crtc->pflip_works = work;
-
-
-	DRM_DEBUG_DRIVER("crtc:%d[%p], pflip_stat:AMDGPU_FLIP_PENDING, work: %p,\n",
-					 amdgpu_crtc->crtc_id, amdgpu_crtc, work);
-	/* update crtc fb */
-	crtc->primary->fb = fb;
 	spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
-	amdgpu_flip_work_func(&work->flip_work.work);
+
+	*work_p = work;
+	*new_abo_p = new_abo;
+
 	return 0;

 pflip_cleanup:
-	if (unlikely(amdgpu_bo_reserve(new_abo, false) != 0)) {
-		DRM_ERROR("failed to reserve new abo in error path\n");
-		goto cleanup;
-	}
+	amdgpu_crtc_cleanup_flip_ctx(work, new_abo);
+	return r;
+
 unpin:
-	if (unlikely(amdgpu_bo_unpin(new_abo) != 0)) {
-		DRM_ERROR("failed to unpin new abo in error path\n");
-	}
+	amdgpu_flip_cleanup_unpin(work, new_abo);
+	return r;
+
 unreserve:
-	amdgpu_bo_unreserve(new_abo);
+	amdgpu_flip_cleanup_unreserve(work, new_abo);
+	return r;

 cleanup:
-	amdgpu_bo_unref(&work->old_abo);
-	dma_fence_put(work->excl);
-	for (i = 0; i < work->shared_count; ++i)
-		dma_fence_put(work->shared[i]);
-	kfree(work->shared);
-	kfree(work);
-
+	amdgpu_flip_work_cleanup(work);
 	return r;
+
+}
+
+void amdgpu_crtc_submit_flip(struct drm_crtc *crtc,
+			     struct drm_framebuffer *fb,
+			     struct amdgpu_flip_work *work,
+			     struct amdgpu_bo *new_abo)
+{
+	unsigned long flags;
+	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+
+	spin_lock_irqsave(&crtc->dev->event_lock, flags);
+	amdgpu_crtc->pflip_status = AMDGPU_FLIP_PENDING;
+	amdgpu_crtc->pflip_works = work;
+
+	/* update crtc fb */
+	crtc->primary->fb = fb;
+	spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+
+	DRM_DEBUG_DRIVER(
+			"crtc:%d[%p], pflip_stat:AMDGPU_FLIP_PENDING, work: %p,\n",
+			amdgpu_crtc->crtc_id, amdgpu_crtc, work);
+
+	amdgpu_flip_work_func(&work->flip_work.work);
+}
+
+int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
+				 struct drm_framebuffer *fb,
+				 struct drm_pending_vblank_event *event,
+				 uint32_t page_flip_flags,
+				 uint32_t target)
+{
+	struct amdgpu_bo *new_abo;
+	struct amdgpu_flip_work *work;
+	int r;
+
+	r = amdgpu_crtc_prepare_flip(crtc,
+				     fb,
+				     event,
+				     page_flip_flags,
+				     target,
+				     &work,
+				     &new_abo);
+	if (r)
+		return r;
+
+	amdgpu_crtc_submit_flip(crtc, fb, work, new_abo);
+
+	return 0;
 }

 int amdgpu_crtc_set_config(struct drm_mode_set *set)
@ -508,7 +587,7 @@ amdgpu_framebuffer_init(struct drm_device *dev,
 {
 	int ret;
 	rfb->obj = obj;
-	drm_helper_mode_fill_fb_struct(&rfb->base, mode_cmd);
+	drm_helper_mode_fill_fb_struct(dev, &rfb->base, mode_cmd);
 	ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
 	if (ret) {
 		rfb->obj = NULL;
@ -582,12 +661,10 @@ int amdgpu_modeset_create_props(struct amdgpu_device *adev)
 {
 	int sz;

-	if (adev->is_atom_bios) {
-		adev->mode_info.coherent_mode_property =
-			drm_property_create_range(adev->ddev, 0 , "coherent", 0, 1);
-		if (!adev->mode_info.coherent_mode_property)
-			return -ENOMEM;
-	}
+	adev->mode_info.coherent_mode_property =
+		drm_property_create_range(adev->ddev, 0 , "coherent", 0, 1);
+	if (!adev->mode_info.coherent_mode_property)
+		return -ENOMEM;

 	adev->mode_info.load_detect_property =
 		drm_property_create_range(adev->ddev, 0, "load detection", 0, 1);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@ -241,13 +241,6 @@ enum amdgpu_pcie_gen {
 	AMDGPU_PCIE_GEN_INVALID = 0xffff
 };

-enum amdgpu_dpm_forced_level {
-	AMDGPU_DPM_FORCED_LEVEL_AUTO = 0,
-	AMDGPU_DPM_FORCED_LEVEL_LOW = 1,
-	AMDGPU_DPM_FORCED_LEVEL_HIGH = 2,
-	AMDGPU_DPM_FORCED_LEVEL_MANUAL = 3,
-};
-
 struct amdgpu_dpm_funcs {
 	int (*get_temperature)(struct amdgpu_device *adev);
 	int (*pre_set_power_state)(struct amdgpu_device *adev);
@ -258,7 +251,7 @@ struct amdgpu_dpm_funcs {
 	u32 (*get_mclk)(struct amdgpu_device *adev, bool low);
 	void (*print_power_state)(struct amdgpu_device *adev, struct amdgpu_ps *ps);
 	void (*debugfs_print_current_performance_level)(struct amdgpu_device *adev, struct seq_file *m);
-	int (*force_performance_level)(struct amdgpu_device *adev, enum amdgpu_dpm_forced_level level);
+	int (*force_performance_level)(struct amdgpu_device *adev, enum amd_dpm_forced_level level);
 	bool (*vblank_too_short)(struct amdgpu_device *adev);
 	void (*powergate_uvd)(struct amdgpu_device *adev, bool gate);
 	void (*powergate_vce)(struct amdgpu_device *adev, bool gate);
@ -353,9 +346,6 @@ struct amdgpu_dpm_funcs {
 #define amdgpu_dpm_get_current_power_state(adev) \
 	(adev)->powerplay.pp_funcs->get_current_power_state((adev)->powerplay.pp_handle)

-#define amdgpu_dpm_get_performance_level(adev) \
-	(adev)->powerplay.pp_funcs->get_performance_level((adev)->powerplay.pp_handle)
-
 #define amdgpu_dpm_get_pp_num_states(adev, data) \
 	(adev)->powerplay.pp_funcs->get_pp_num_states((adev)->powerplay.pp_handle, data)

@ -393,6 +383,11 @@ struct amdgpu_dpm_funcs {
 	 (adev)->powerplay.pp_funcs->get_vce_clock_state((adev)->powerplay.pp_handle, (i)) : \
 	 (adev)->pm.funcs->get_vce_clock_state((adev), (i)))

+#define amdgpu_dpm_get_performance_level(adev) \
+	((adev)->pp_enabled ?						\
+	(adev)->powerplay.pp_funcs->get_performance_level((adev)->powerplay.pp_handle) : \
+	(adev)->pm.dpm.forced_level)
+
 struct amdgpu_dpm {
 	struct amdgpu_ps        *ps;
 	/* number of valid power states */
@ -440,7 +435,7 @@ struct amdgpu_dpm {
 	/* thermal handling */
 	struct amdgpu_dpm_thermal thermal;
 	/* forced levels */
-	enum amdgpu_dpm_forced_level forced_level;
+	enum amd_dpm_forced_level forced_level;
 };

 struct amdgpu_pm {
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@ -90,7 +90,6 @@ int amdgpu_vram_page_split = 1024;
 int amdgpu_exp_hw_support = 0;
 int amdgpu_sched_jobs = 32;
 int amdgpu_sched_hw_submission = 2;
-int amdgpu_powerplay = -1;
 int amdgpu_no_evict = 0;
 int amdgpu_direct_gma_size = 0;
 unsigned amdgpu_pcie_gen_cap = 0;
@ -179,9 +178,6 @@ module_param_named(sched_jobs, amdgpu_sched_jobs, int, 0444);
 MODULE_PARM_DESC(sched_hw_submission, "the max number of HW submissions (default 2)");
 module_param_named(sched_hw_submission, amdgpu_sched_hw_submission, int, 0444);

-MODULE_PARM_DESC(powerplay, "Powerplay component (1 = enable, 0 = disable, -1 = auto (default))");
-module_param_named(powerplay, amdgpu_powerplay, int, 0444);
-
 MODULE_PARM_DESC(ppfeaturemask, "all power features enabled (default))");
 module_param_named(ppfeaturemask, amdgpu_pp_feature_mask, int, 0444);

@ -686,7 +682,6 @@ static struct drm_driver kms_driver = {
 	    DRIVER_USE_AGP |
 	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
 	    DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET,
-	.dev_priv_size = 0,
 	.load = amdgpu_driver_load_kms,
 	.open = amdgpu_driver_open_kms,
 	.preclose = amdgpu_driver_preclose_kms,
@ -701,7 +696,6 @@ static struct drm_driver kms_driver = {
 	.get_scanout_position = amdgpu_get_crtc_scanoutpos,
 #if defined(CONFIG_DEBUG_FS)
 	.debugfs_init = amdgpu_debugfs_init,
-	.debugfs_cleanup = amdgpu_debugfs_cleanup,
 #endif
 	.irq_preinstall = amdgpu_irq_preinstall,
 	.irq_postinstall = amdgpu_irq_postinstall,
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@ -245,7 +245,7 @@ static int amdgpufb_create(struct drm_fb_helper *helper,

 	strcpy(info->fix.id, "amdgpudrmfb");

-	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->depth);
+	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);

 	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &amdgpufb_ops;
@ -272,7 +272,7 @@ static int amdgpufb_create(struct drm_fb_helper *helper,
 	DRM_INFO("fb mappable at 0x%lX\n",  info->fix.smem_start);
 	DRM_INFO("vram apper at 0x%lX\n",  (unsigned long)adev->mc.aper_base);
 	DRM_INFO("size %lu\n", (unsigned long)amdgpu_bo_size(abo));
-	DRM_INFO("fb depth is %d\n", fb->depth);
+	DRM_INFO("fb depth is %d\n", fb->format->depth);
 	DRM_INFO("   pitch is %d\n", fb->pitches[0]);

 	vga_switcheroo_client_fb_set(adev->ddev->pdev, info);
@ -374,7 +374,6 @@ int amdgpu_fbdev_init(struct amdgpu_device *adev)
 			&amdgpu_fb_helper_funcs);

 	ret = drm_fb_helper_init(adev->ddev, &rfbdev->helper,
-				 adev->mode_info.num_crtc,
 				 AMDGPUFB_CONN_LIMIT);
 	if (ret) {
 		kfree(rfbdev);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@ -471,12 +471,15 @@ out:

 static int amdgpu_gem_va_check(void *param, struct amdgpu_bo *bo)
 {
-	unsigned domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-
 	/* if anything is swapped out don't swap it in here,
 	   just abort and wait for the next CS */
+	if (!amdgpu_bo_gpu_accessible(bo))
+		return -ERESTARTSYS;

-	return domain == AMDGPU_GEM_DOMAIN_CPU ? -ERESTARTSYS : 0;
+	if (bo->shadow && !amdgpu_bo_gpu_accessible(bo->shadow))
+		return -ERESTARTSYS;
+
+	return 0;
 }

 /**
@ -484,62 +487,44 @@ static int amdgpu_gem_va_check(void *param, struct amdgpu_bo *bo)
 *
 * @adev: amdgpu_device pointer
 * @bo_va: bo_va to update
+ * @list: validation list
+ * @operation: map or unmap
 *
- * Update the bo_va directly after setting it's address. Errors are not
+ * Update the bo_va directly after setting its address. Errors are not
 * vital here, so they are not reported back to userspace.
 */
 static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
 				    struct amdgpu_bo_va *bo_va,
+				    struct list_head *list,
 				    uint32_t operation)
 {
-	struct ttm_validate_buffer tv, *entry;
-	struct amdgpu_bo_list_entry vm_pd;
-	struct ww_acquire_ctx ticket;
-	struct list_head list, duplicates;
-	unsigned domain;
-	int r;
+	struct ttm_validate_buffer *entry;
+	int r = -ERESTARTSYS;

-	INIT_LIST_HEAD(&list);
-	INIT_LIST_HEAD(&duplicates);
-
-	tv.bo = &bo_va->bo->tbo;
-	tv.shared = true;
-	list_add(&tv.head, &list);
-
-	amdgpu_vm_get_pd_bo(bo_va->vm, &list, &vm_pd);
-
-	/* Provide duplicates to avoid -EALREADY */
-	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
-	if (r)
-		goto error_print;
-
-	list_for_each_entry(entry, &list, head) {
-		domain = amdgpu_mem_type_to_domain(entry->bo->mem.mem_type);
-		/* if anything is swapped out don't swap it in here,
-		   just abort and wait for the next CS */
-		if (domain == AMDGPU_GEM_DOMAIN_CPU)
-			goto error_unreserve;
+	list_for_each_entry(entry, list, head) {
+		struct amdgpu_bo *bo =
+			container_of(entry->bo, struct amdgpu_bo, tbo);
+		if (amdgpu_gem_va_check(NULL, bo))
+			goto error;
 	}
+
 	r = amdgpu_vm_validate_pt_bos(adev, bo_va->vm, amdgpu_gem_va_check,
 				      NULL);
 	if (r)
-		goto error_unreserve;
+		goto error;

 	r = amdgpu_vm_update_page_directory(adev, bo_va->vm);
 	if (r)
-		goto error_unreserve;
+		goto error;

 	r = amdgpu_vm_clear_freed(adev, bo_va->vm);
 	if (r)
-		goto error_unreserve;
+		goto error;

 	if (operation == AMDGPU_VA_OP_MAP)
 		r = amdgpu_vm_bo_update(adev, bo_va, false);

-error_unreserve:
-	ttm_eu_backoff_reservation(&ticket, &list);
-
-error_print:
+error:
 	if (r && r != -ERESTARTSYS)
 		DRM_ERROR("Couldn't update BO_VA (%d)\n", r);
 }
@ -556,7 +541,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 	struct amdgpu_bo_list_entry vm_pd;
 	struct ttm_validate_buffer tv;
 	struct ww_acquire_ctx ticket;
-	struct list_head list, duplicates;
+	struct list_head list;
 	uint32_t invalid_flags, va_flags = 0;
 	int r = 0;

@ -594,14 +579,13 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 	abo = gem_to_amdgpu_bo(gobj);
 	INIT_LIST_HEAD(&list);
-	INIT_LIST_HEAD(&duplicates);
 	tv.bo = &abo->tbo;
-	tv.shared = true;
+	tv.shared = false;
 	list_add(&tv.head, &list);

 	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);

-	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
 	if (r) {
 		drm_gem_object_unreference_unlocked(gobj);
 		return r;
@ -632,10 +616,10 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 	default:
 		break;
 	}
-	ttm_eu_backoff_reservation(&ticket, &list);
 	if (!r && !(args->flags & AMDGPU_VM_DELAY_UPDATE) &&
 	    !amdgpu_vm_debug)
-		amdgpu_gem_va_update_vm(adev, bo_va, args->operation);
+		amdgpu_gem_va_update_vm(adev, bo_va, &list, args->operation);
+	ttm_eu_backoff_reservation(&ticket, &list);

 	drm_gem_object_unreference_unlocked(gobj);
 	return r;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@ -42,12 +42,12 @@ int amdgpu_gfx_scratch_get(struct amdgpu_device *adev, uint32_t *reg)
 {
 	int i;

-	for (i = 0; i < adev->gfx.scratch.num_reg; i++) {
-		if (adev->gfx.scratch.free[i]) {
-			adev->gfx.scratch.free[i] = false;
-			*reg = adev->gfx.scratch.reg[i];
-			return 0;
-		}
+	i = ffs(adev->gfx.scratch.free_mask);
+	if (i != 0 && i <= adev->gfx.scratch.num_reg) {
+		i--;
+		adev->gfx.scratch.free_mask &= ~(1u << i);
+		*reg = adev->gfx.scratch.reg_base + i;
+		return 0;
 	}
 	return -EINVAL;
 }
@ -62,14 +62,7 @@ int amdgpu_gfx_scratch_get(struct amdgpu_device *adev, uint32_t *reg)
 */
 void amdgpu_gfx_scratch_free(struct amdgpu_device *adev, uint32_t reg)
 {
-	int i;
-
-	for (i = 0; i < adev->gfx.scratch.num_reg; i++) {
-		if (adev->gfx.scratch.reg[i] == reg) {
-			adev->gfx.scratch.free[i] = true;
-			return;
-		}
-	}
+	adev->gfx.scratch.free_mask |= 1u << (reg - adev->gfx.scratch.reg_base);
 }

 /**
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@ -97,8 +97,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 {
 	struct amdgpu_gtt_mgr *mgr = man->priv;
 	struct drm_mm_node *node = mem->mm_node;
-	enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST;
-	enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT;
+	enum drm_mm_insert_mode mode;
 	unsigned long fpfn, lpfn;
 	int r;

@ -115,15 +114,14 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 	else
 		lpfn = man->size;

-	if (place && place->flags & TTM_PL_FLAG_TOPDOWN) {
-		sflags = DRM_MM_SEARCH_BELOW;
-		aflags = DRM_MM_CREATE_TOP;
-	}
+	mode = DRM_MM_INSERT_BEST;
+	if (place && place->flags & TTM_PL_FLAG_TOPDOWN)
+		mode = DRM_MM_INSERT_HIGH;

 	spin_lock(&mgr->lock);
-	r = drm_mm_insert_node_in_range_generic(&mgr->mm, node, mem->num_pages,
-						mem->page_alignment, 0,
-						fpfn, lpfn, sflags, aflags);
+	r = drm_mm_insert_node_in_range(&mgr->mm, node,
+					mem->num_pages, mem->page_alignment, 0,
+					fpfn, lpfn, mode);
 	spin_unlock(&mgr->lock);

 	if (!r) {
@ -235,16 +233,17 @@ static void amdgpu_gtt_mgr_debug(struct ttm_mem_type_manager *man,
 				  const char *prefix)
 {
 	struct amdgpu_gtt_mgr *mgr = man->priv;
+	struct drm_printer p = drm_debug_printer(prefix);

 	spin_lock(&mgr->lock);
-	drm_mm_debug_table(&mgr->mm, prefix);
+	drm_mm_print(&mgr->mm, &p);
 	spin_unlock(&mgr->lock);
 }

 const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func = {
-	amdgpu_gtt_mgr_init,
-	amdgpu_gtt_mgr_fini,
-	amdgpu_gtt_mgr_new,
-	amdgpu_gtt_mgr_del,
-	amdgpu_gtt_mgr_debug
+	.init = amdgpu_gtt_mgr_init,
+	.takedown = amdgpu_gtt_mgr_fini,
+	.get_node = amdgpu_gtt_mgr_new,
+	.put_node = amdgpu_gtt_mgr_del,
+	.debug = amdgpu_gtt_mgr_debug
 };
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c
@ -231,8 +231,7 @@ void amdgpu_i2c_init(struct amdgpu_device *adev)
 	if (amdgpu_hw_i2c)
 		DRM_INFO("hw_i2c forced on, you may experience display detection problems!\n");

-	if (adev->is_atom_bios)
-		amdgpu_atombios_i2c_init(adev);
+	amdgpu_atombios_i2c_init(adev);
 }

 /* remove all the buses */
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@ -116,8 +116,8 @@ void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib,
 * to SI there was just a DE IB.
 */
 int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
-		       struct amdgpu_ib *ibs, struct dma_fence *last_vm_update,
-		       struct amdgpu_job *job, struct dma_fence **f)
+		       struct amdgpu_ib *ibs, struct amdgpu_job *job,
+		       struct dma_fence **f)
 {
 	struct amdgpu_device *adev = ring->adev;
 	struct amdgpu_ib *ib = &ibs[0];
@ -175,15 +175,15 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	if (ring->funcs->emit_hdp_flush)
 		amdgpu_ring_emit_hdp_flush(ring);

-	/* always set cond_exec_polling to CONTINUE */
-	*ring->cond_exe_cpu_addr = 1;
-
 	skip_preamble = ring->current_ctx == fence_ctx;
 	need_ctx_switch = ring->current_ctx != fence_ctx;
 	if (job && ring->funcs->emit_cntxcntl) {
 		if (need_ctx_switch)
 			status |= AMDGPU_HAVE_CTX_SWITCH;
 		status |= job->preamble_status;
+
+		if (vm)
+			status |= AMDGPU_VM_DOMAIN;
 		amdgpu_ring_emit_cntxcntl(ring, status);
 	}

@ -193,7 +193,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		/* drop preamble IBs if we don't have a context switch */
 		if ((ib->flags & AMDGPU_IB_FLAG_PREAMBLE) &&
 			skip_preamble &&
-			!(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST))
+			!(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) &&
+			!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
 			continue;

 		amdgpu_ring_emit_ib(ring, ib, job ? job->vm_id : 0,
@ -223,7 +224,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 		amdgpu_ring_patch_cond_exec(ring, patch_offset);

 	ring->current_ctx = fence_ctx;
-	if (ring->funcs->emit_switch_buffer)
+	if (vm && ring->funcs->emit_switch_buffer)
 		amdgpu_ring_emit_switch_buffer(ring);
 	amdgpu_ring_commit(ring);
 	return 0;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@ -61,10 +61,8 @@ static void amdgpu_hotplug_work_func(struct work_struct *work)
 	struct drm_connector *connector;

 	mutex_lock(&mode_config->mutex);
-	if (mode_config->num_connector) {
-		list_for_each_entry(connector, &mode_config->connector_list, head)
-			amdgpu_connector_hotplug(connector);
-	}
+	list_for_each_entry(connector, &mode_config->connector_list, head)
+		amdgpu_connector_hotplug(connector);
 	mutex_unlock(&mode_config->mutex);
 	/* Just fire off a uevent and let userspace tell us what to do */
 	drm_helper_hpd_irq_event(dev);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@ -170,8 +170,7 @@ static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
 	BUG_ON(amdgpu_sync_peek_fence(&job->sync, NULL));

 	trace_amdgpu_sched_run_job(job);
-	r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs,
-			       job->sync.last_vm_update, job, &fence);
+	r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs, job, &fence);
 	if (r)
 		DRM_ERROR("Error scheduling IBs (%d)\n", r);

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@ -50,16 +50,19 @@ static inline bool amdgpu_has_atpx(void) { return false; }
 * This is the main unload function for KMS (all asics).
 * Returns 0 on success.
 */
-int amdgpu_driver_unload_kms(struct drm_device *dev)
+void amdgpu_driver_unload_kms(struct drm_device *dev)
 {
 	struct amdgpu_device *adev = dev->dev_private;

 	if (adev == NULL)
-		return 0;
+		return;

 	if (adev->rmmio == NULL)
 		goto done_free;

+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_request_full_gpu(adev, false);
+
 	if (amdgpu_device_is_px(dev)) {
 		pm_runtime_get_sync(dev->dev);
 		pm_runtime_forbid(dev->dev);
@ -74,7 +77,6 @@ int amdgpu_driver_unload_kms(struct drm_device *dev)
 done_free:
 	kfree(adev);
 	dev->dev_private = NULL;
-	return 0;
 }

 /**
@ -139,6 +141,9 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
 		pm_runtime_put_autosuspend(dev->dev);
 	}

+	if (amdgpu_sriov_vf(adev))
+		amdgpu_virt_release_full_gpu(adev, true);
+
 out:
 	if (r) {
 		/* balance pm_runtime_get_sync in amdgpu_driver_unload_kms */
@ -570,6 +575,27 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			return -EINVAL;
 		}
 	}
+	case AMDGPU_INFO_NUM_HANDLES: {
+		struct drm_amdgpu_info_num_handles handle;
+
+		switch (info->query_hw_ip.type) {
+		case AMDGPU_HW_IP_UVD:
+			/* Starting Polaris, we support unlimited UVD handles */
+			if (adev->asic_type < CHIP_POLARIS10) {
+				handle.uvd_max_handles = adev->uvd.max_handles;
+				handle.uvd_used_handles = amdgpu_uvd_used_handles(adev);
+
+				return copy_to_user(out, &handle,
+					min((size_t)size, sizeof(handle))) ? -EFAULT : 0;
+			} else {
+				return -ENODATA;
+			}
+
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
 	default:
 		DRM_DEBUG_KMS("Invalid request %d\n", info->query);
 		return -EINVAL;
@ -629,6 +655,12 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 		goto out_suspend;
 	}

+	if (amdgpu_sriov_vf(adev)) {
+		r = amdgpu_map_static_csa(adev, &fpriv->vm);
+		if (r)
+			goto out_suspend;
+	}
+
 	mutex_init(&fpriv->bo_list_lock);
 	idr_init(&fpriv->bo_list_handles);

@ -667,6 +699,14 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 	amdgpu_uvd_free_handles(adev, file_priv);
 	amdgpu_vce_free_handles(adev, file_priv);

+	if (amdgpu_sriov_vf(adev)) {
+		/* TODO: how to handle reserve failure */
+		BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, false));
+		amdgpu_vm_bo_rmv(adev, fpriv->vm.csa_bo_va);
+		fpriv->vm.csa_bo_va = NULL;
+		amdgpu_bo_unreserve(adev->virt.csa_obj);
+	}
+
 	amdgpu_vm_fini(adev, &fpriv->vm);

 	idr_for_each_entry(&fpriv->bo_list_handles, list, handle)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@ -32,6 +32,7 @@

 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_encoder.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_fixed.h>
 #include <drm/drm_crtc_helper.h>
@ -594,6 +595,21 @@ int amdgpu_crtc_page_flip_target(struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
 				 struct drm_pending_vblank_event *event,
 				 uint32_t page_flip_flags, uint32_t target);
+void amdgpu_crtc_cleanup_flip_ctx(struct amdgpu_flip_work *work,
+				  struct amdgpu_bo *new_abo);
+int amdgpu_crtc_prepare_flip(struct drm_crtc *crtc,
+			     struct drm_framebuffer *fb,
+			     struct drm_pending_vblank_event *event,
+			     uint32_t page_flip_flags,
+			     uint32_t target,
+			     struct amdgpu_flip_work **work,
+			     struct amdgpu_bo **new_abo);
+
+void amdgpu_crtc_submit_flip(struct drm_crtc *crtc,
+			     struct drm_framebuffer *fb,
+			     struct amdgpu_flip_work *work,
+			     struct amdgpu_bo *new_abo);
+
 extern const struct drm_mode_config_funcs amdgpu_mode_funcs;

 #endif
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@ -323,6 +323,7 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	struct amdgpu_bo *bo;
 	enum ttm_bo_type type;
 	unsigned long page_align;
+	u64 initial_bytes_moved;
 	size_t acc_size;
 	int r;

@ -363,11 +364,33 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,

 	bo->flags = flags;

+#ifdef CONFIG_X86_32
+	/* XXX: Write-combined CPU mappings of GTT seem broken on 32-bit
+	 * See https://bugs.freedesktop.org/show_bug.cgi?id=84627
+	 */
+	bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+#elif defined(CONFIG_X86) && !defined(CONFIG_X86_PAT)
+	/* Don't try to enable write-combining when it can't work, or things
+	 * may be slow
+	 * See https://bugs.freedesktop.org/show_bug.cgi?id=88758
+	 */
+
+#ifndef CONFIG_COMPILE_TEST
+#warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
+	 thanks to write-combining
+#endif
+
+	if (bo->flags & AMDGPU_GEM_CREATE_CPU_GTT_USWC)
+		DRM_INFO_ONCE("Please enable CONFIG_MTRR and CONFIG_X86_PAT for "
+			      "better performance thanks to write-combining\n");
+	bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+#else
 	/* For architectures that don't support WC memory,
 	 * mask out the WC flag from the BO
 	 */
 	if (!drm_arch_can_wc_memory())
 		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+#endif

 	amdgpu_fill_placement_to_bo(bo, placement);
 	/* Kernel allocation are uninterruptible */
@ -379,12 +402,25 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 		locked = ww_mutex_trylock(&bo->tbo.ttm_resv.lock);
 		WARN_ON(!locked);
 	}
+
+	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
 	r = ttm_bo_init(&adev->mman.bdev, &bo->tbo, size, type,
 			&bo->placement, page_align, !kernel, NULL,
 			acc_size, sg, resv ? resv : &bo->tbo.ttm_resv,
 			&amdgpu_ttm_bo_destroy);
-	if (unlikely(r != 0))
+	amdgpu_cs_report_moved_bytes(adev,
+		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
+
+	if (unlikely(r != 0)) {
+		if (!resv)
+			ww_mutex_unlock(&bo->tbo.resv->lock);
 		return r;
+	}
+
+	bo->tbo.priority = ilog2(bo->tbo.num_pages);
+	if (kernel)
+		bo->tbo.priority *= 2;
+	bo->tbo.priority = min(bo->tbo.priority, (unsigned)(TTM_MAX_BO_PRIORITY - 1));

 	if (flags & AMDGPU_GEM_CREATE_VRAM_CLEARED &&
 	    bo->tbo.mem.placement & TTM_PL_FLAG_VRAM) {
@ -408,7 +444,8 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	return 0;

 fail_unreserve:
-	ww_mutex_unlock(&bo->tbo.resv->lock);
+	if (!resv)
+		ww_mutex_unlock(&bo->tbo.resv->lock);
 	amdgpu_bo_unref(&bo);
 	return r;
 }
@ -472,7 +509,16 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 		return r;

 	if (amdgpu_need_backup(adev) && (flags & AMDGPU_GEM_CREATE_SHADOW)) {
+		if (!resv) {
+			r = ww_mutex_lock(&(*bo_ptr)->tbo.resv->lock, NULL);
+			WARN_ON(r != 0);
+		}
+
 		r = amdgpu_bo_create_shadow(adev, size, byte_align, (*bo_ptr));
+
+		if (!resv)
+			ww_mutex_unlock(&(*bo_ptr)->tbo.resv->lock);
+
 		if (r)
 			amdgpu_bo_unref(bo_ptr);
 	}
@ -849,6 +895,7 @@ int amdgpu_bo_get_metadata(struct amdgpu_bo *bo, void *buffer,
 }

 void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
+			   bool evict,
 			   struct ttm_mem_reg *new_mem)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
@ -861,6 +908,10 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 	abo = container_of(bo, struct amdgpu_bo, tbo);
 	amdgpu_vm_bo_invalidate(adev, abo);

+	/* remember the eviction */
+	if (evict)
+		atomic64_inc(&adev->num_evictions);
+
 	/* update statistics */
 	if (!new_mem)
 		return;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@ -114,6 +114,15 @@ static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo *bo)
 	return drm_vma_node_offset_addr(&bo->tbo.vma_node);
 }

+/**
+ * amdgpu_bo_gpu_accessible - return whether the bo is currently in memory that
+ * is accessible to the GPU.
+ */
+static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
+{
+	return bo->tbo.mem.mem_type != TTM_PL_SYSTEM;
+}
+
 int amdgpu_bo_create(struct amdgpu_device *adev,
 			    unsigned long size, int byte_align,
 			    bool kernel, u32 domain, u64 flags,
@ -155,7 +164,8 @@ int amdgpu_bo_get_metadata(struct amdgpu_bo *bo, void *buffer,
 			   size_t buffer_size, uint32_t *metadata_size,
 			   uint64_t *flags);
 void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
-				  struct ttm_mem_reg *new_mem);
+			   bool evict,
+			   struct ttm_mem_reg *new_mem);
 int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
 void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		     bool shared);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@ -34,6 +34,28 @@

 static int amdgpu_debugfs_pm_init(struct amdgpu_device *adev);

+static const struct cg_flag_name clocks[] = {
+	{AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
+	{AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
+	{AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
+	{AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock Gating"},
+	{AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light Sleep"},
+	{AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
+	{AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light Sleep"},
+	{AMD_CG_SUPPORT_MC_LS, "Memory Controller Light Sleep"},
+	{AMD_CG_SUPPORT_MC_MGCG, "Memory Controller Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_SDMA_LS, "System Direct Memory Access Light Sleep"},
+	{AMD_CG_SUPPORT_SDMA_MGCG, "System Direct Memory Access Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_BIF_LS, "Bus Interface Light Sleep"},
+	{AMD_CG_SUPPORT_UVD_MGCG, "Unified Video Decoder Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_VCE_MGCG, "Video Compression Engine Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_HDP_LS, "Host Data Path Light Sleep"},
+	{AMD_CG_SUPPORT_HDP_MGCG, "Host Data Path Medium Grain Clock Gating"},
+	{AMD_CG_SUPPORT_ROM_MGCG, "Rom Medium Grain Clock Gating"},
+	{0, NULL},
+};
+
 void amdgpu_pm_acpi_event_handler(struct amdgpu_device *adev)
 {
 	if (adev->pp_enabled)
@ -112,28 +134,23 @@ static ssize_t amdgpu_get_dpm_forced_performance_level(struct device *dev,
 {
 	struct drm_device *ddev = dev_get_drvdata(dev);
 	struct amdgpu_device *adev = ddev->dev_private;
+	enum amd_dpm_forced_level level;

 	if  ((adev->flags & AMD_IS_PX) &&
 	     (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
 		return snprintf(buf, PAGE_SIZE, "off\n");

-	if (adev->pp_enabled) {
-		enum amd_dpm_forced_level level;
-
-		level = amdgpu_dpm_get_performance_level(adev);
-		return snprintf(buf, PAGE_SIZE, "%s\n",
-				(level == AMD_DPM_FORCED_LEVEL_AUTO) ? "auto" :
-				(level == AMD_DPM_FORCED_LEVEL_LOW) ? "low" :
-				(level == AMD_DPM_FORCED_LEVEL_HIGH) ? "high" :
-				(level == AMD_DPM_FORCED_LEVEL_MANUAL) ? "manual" : "unknown");
-	} else {
-		enum amdgpu_dpm_forced_level level;
-
-		level = adev->pm.dpm.forced_level;
-		return snprintf(buf, PAGE_SIZE, "%s\n",
-				(level == AMDGPU_DPM_FORCED_LEVEL_AUTO) ? "auto" :
-				(level == AMDGPU_DPM_FORCED_LEVEL_LOW) ? "low" : "high");
-	}
+	level = amdgpu_dpm_get_performance_level(adev);
+	return snprintf(buf, PAGE_SIZE, "%s\n",
+			(level == AMD_DPM_FORCED_LEVEL_AUTO) ? "auto" :
+			(level == AMD_DPM_FORCED_LEVEL_LOW) ? "low" :
+			(level == AMD_DPM_FORCED_LEVEL_HIGH) ? "high" :
+			(level == AMD_DPM_FORCED_LEVEL_MANUAL) ? "manual" :
+			(level == AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD) ? "profile_standard" :
+			(level == AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK) ? "profile_min_sclk" :
+			(level == AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK) ? "profile_min_mclk" :
+			(level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK) ? "profile_peak" :
+			"unknown");
 }

 static ssize_t amdgpu_set_dpm_forced_performance_level(struct device *dev,
@ -143,7 +160,8 @@ static ssize_t amdgpu_set_dpm_forced_performance_level(struct device *dev,
 {
 	struct drm_device *ddev = dev_get_drvdata(dev);
 	struct amdgpu_device *adev = ddev->dev_private;
-	enum amdgpu_dpm_forced_level level;
+	enum amd_dpm_forced_level level;
+	enum amd_dpm_forced_level current_level;
 	int ret = 0;

 	/* Can't force performance level when the card is off */
@ -151,19 +169,34 @@ static ssize_t amdgpu_set_dpm_forced_performance_level(struct device *dev,
 	     (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
 		return -EINVAL;

+	current_level = amdgpu_dpm_get_performance_level(adev);
+
 	if (strncmp("low", buf, strlen("low")) == 0) {
-		level = AMDGPU_DPM_FORCED_LEVEL_LOW;
+		level = AMD_DPM_FORCED_LEVEL_LOW;
 	} else if (strncmp("high", buf, strlen("high")) == 0) {
-		level = AMDGPU_DPM_FORCED_LEVEL_HIGH;
+		level = AMD_DPM_FORCED_LEVEL_HIGH;
 	} else if (strncmp("auto", buf, strlen("auto")) == 0) {
-		level = AMDGPU_DPM_FORCED_LEVEL_AUTO;
+		level = AMD_DPM_FORCED_LEVEL_AUTO;
 	} else if (strncmp("manual", buf, strlen("manual")) == 0) {
-		level = AMDGPU_DPM_FORCED_LEVEL_MANUAL;
-	} else {
+		level = AMD_DPM_FORCED_LEVEL_MANUAL;
+	} else if (strncmp("profile_exit", buf, strlen("profile_exit")) == 0) {
+		level = AMD_DPM_FORCED_LEVEL_PROFILE_EXIT;
+	} else if (strncmp("profile_standard", buf, strlen("profile_standard")) == 0) {
+		level = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD;
+	} else if (strncmp("profile_min_sclk", buf, strlen("profile_min_sclk")) == 0) {
+		level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK;
+	} else if (strncmp("profile_min_mclk", buf, strlen("profile_min_mclk")) == 0) {
+		level = AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK;
+	} else if (strncmp("profile_peak", buf, strlen("profile_peak")) == 0) {
+		level = AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
+	}  else {
 		count = -EINVAL;
 		goto fail;
 	}

+	if (current_level == level)
+		return count;
+
 	if (adev->pp_enabled)
 		amdgpu_dpm_force_performance_level(adev, level);
 	else {
@ -180,6 +213,7 @@ static ssize_t amdgpu_set_dpm_forced_performance_level(struct device *dev,
 			adev->pm.dpm.forced_level = level;
 		mutex_unlock(&adev->pm.mutex);
 	}
+
 fail:
 	return count;
 }
@ -1060,9 +1094,9 @@ static void amdgpu_dpm_change_power_state_locked(struct amdgpu_device *adev)

 	if (adev->pm.funcs->force_performance_level) {
 		if (adev->pm.dpm.thermal_active) {
-			enum amdgpu_dpm_forced_level level = adev->pm.dpm.forced_level;
+			enum amd_dpm_forced_level level = adev->pm.dpm.forced_level;
 			/* force low perf level for thermal */
-			amdgpu_dpm_force_performance_level(adev, AMDGPU_DPM_FORCED_LEVEL_LOW);
+			amdgpu_dpm_force_performance_level(adev, AMD_DPM_FORCED_LEVEL_LOW);
 			/* save the user's level */
 			adev->pm.dpm.forced_level = level;
 		} else {
@ -1108,12 +1142,22 @@ void amdgpu_dpm_enable_vce(struct amdgpu_device *adev, bool enable)
 			/* XXX select vce level based on ring/task */
 			adev->pm.dpm.vce_level = AMD_VCE_LEVEL_AC_ALL;
 			mutex_unlock(&adev->pm.mutex);
+			amdgpu_pm_compute_clocks(adev);
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							AMD_PG_STATE_UNGATE);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							AMD_CG_STATE_UNGATE);
 		} else {
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							AMD_PG_STATE_GATE);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							AMD_CG_STATE_GATE);
 			mutex_lock(&adev->pm.mutex);
 			adev->pm.dpm.vce_active = false;
 			mutex_unlock(&adev->pm.mutex);
+			amdgpu_pm_compute_clocks(adev);
 		}
-		amdgpu_pm_compute_clocks(adev);
+
 	}
 }

@ -1252,7 +1296,8 @@ void amdgpu_pm_compute_clocks(struct amdgpu_device *adev)
 	if (!adev->pm.dpm_enabled)
 		return;

-	amdgpu_display_bandwidth_update(adev);
+	if (adev->mode_info.num_crtc)
+		amdgpu_display_bandwidth_update(adev);

 	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
 		struct amdgpu_ring *ring = adev->rings[i];
@ -1351,12 +1396,27 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, struct amdgpu_device *a
 	return 0;
 }

+static void amdgpu_parse_cg_state(struct seq_file *m, u32 flags)
+{
+	int i;
+
+	for (i = 0; clocks[i].flag; i++)
+		seq_printf(m, "\t%s: %s\n", clocks[i].name,
+			   (flags & clocks[i].flag) ? "On" : "Off");
+}
+
 static int amdgpu_debugfs_pm_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct amdgpu_device *adev = dev->dev_private;
 	struct drm_device *ddev = adev->ddev;
+	u32 flags = 0;
+
+	amdgpu_get_clockgating_state(adev, &flags);
+	seq_printf(m, "Clock Gating Flags Mask: 0x%x\n", flags);
+	amdgpu_parse_cg_state(m, flags);
+	seq_printf(m, "\n");

 	if (!adev->pm.dpm_enabled) {
 		seq_printf(m, "dpm not enabled\n");
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h
@ -24,6 +24,12 @@
 #ifndef __AMDGPU_PM_H__
 #define __AMDGPU_PM_H__

+struct cg_flag_name
+{
+	u32 flag;
+	const char *name;
+};
+
 int amdgpu_pm_sysfs_init(struct amdgpu_device *adev);
 void amdgpu_pm_sysfs_fini(struct amdgpu_device *adev);
 void amdgpu_pm_print_power_states(struct amdgpu_device *adev);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
@ -34,67 +34,34 @@
 #include "cik_dpm.h"
 #include "vi_dpm.h"

-static int amdgpu_powerplay_init(struct amdgpu_device *adev)
+static int amdgpu_create_pp_handle(struct amdgpu_device *adev)
 {
-	int ret = 0;
+	struct amd_pp_init pp_init;
 	struct amd_powerplay *amd_pp;
+	int ret;

 	amd_pp = &(adev->powerplay);
-
-	if (adev->pp_enabled) {
-		struct amd_pp_init *pp_init;
-
-		pp_init = kzalloc(sizeof(struct amd_pp_init), GFP_KERNEL);
-
-		if (pp_init == NULL)
-			return -ENOMEM;
-
-		pp_init->chip_family = adev->family;
-		pp_init->chip_id = adev->asic_type;
-		pp_init->device = amdgpu_cgs_create_device(adev);
-		ret = amd_powerplay_init(pp_init, amd_pp);
-		kfree(pp_init);
-	} else {
-		amd_pp->pp_handle = (void *)adev;
-
-		switch (adev->asic_type) {
-#ifdef CONFIG_DRM_AMDGPU_SI
-		case CHIP_TAHITI:
-		case CHIP_PITCAIRN:
-		case CHIP_VERDE:
-		case CHIP_OLAND:
-		case CHIP_HAINAN:
-			amd_pp->ip_funcs = &si_dpm_ip_funcs;
-		break;
-#endif
-#ifdef CONFIG_DRM_AMDGPU_CIK
-		case CHIP_BONAIRE:
-		case CHIP_HAWAII:
-			amd_pp->ip_funcs = &ci_dpm_ip_funcs;
-			break;
-		case CHIP_KABINI:
-		case CHIP_MULLINS:
-		case CHIP_KAVERI:
-			amd_pp->ip_funcs = &kv_dpm_ip_funcs;
-			break;
-#endif
-		case CHIP_CARRIZO:
-		case CHIP_STONEY:
-			amd_pp->ip_funcs = &cz_dpm_ip_funcs;
-			break;
-		default:
-			ret = -EINVAL;
-			break;
-		}
-	}
-	return ret;
+	pp_init.chip_family = adev->family;
+	pp_init.chip_id = adev->asic_type;
+	pp_init.pm_en = amdgpu_dpm != 0 ? true : false;
+	pp_init.feature_mask = amdgpu_pp_feature_mask;
+	pp_init.device = amdgpu_cgs_create_device(adev);
+	ret = amd_powerplay_create(&pp_init, &(amd_pp->pp_handle));
+	if (ret)
+		return -EINVAL;
+	return 0;
 }

 static int amdgpu_pp_early_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	struct amd_powerplay *amd_pp;
 	int ret = 0;

+	amd_pp = &(adev->powerplay);
+	adev->pp_enabled = false;
+	amd_pp->pp_handle = (void *)adev;
+
 	switch (adev->asic_type) {
 	case CHIP_POLARIS11:
 	case CHIP_POLARIS10:
@ -102,30 +69,48 @@ static int amdgpu_pp_early_init(void *handle)
 	case CHIP_TONGA:
 	case CHIP_FIJI:
 	case CHIP_TOPAZ:
-		adev->pp_enabled = true;
-		break;
 	case CHIP_CARRIZO:
 	case CHIP_STONEY:
-		adev->pp_enabled = (amdgpu_powerplay == 0) ? false : true;
+		adev->pp_enabled = true;
+		if (amdgpu_create_pp_handle(adev))
+			return -EINVAL;
+		amd_pp->ip_funcs = &pp_ip_funcs;
+		amd_pp->pp_funcs = &pp_dpm_funcs;
 		break;
 	/* These chips don't have powerplay implemenations */
+#ifdef CONFIG_DRM_AMDGPU_SI
+	case CHIP_TAHITI:
+	case CHIP_PITCAIRN:
+	case CHIP_VERDE:
+	case CHIP_OLAND:
+	case CHIP_HAINAN:
+		amd_pp->ip_funcs = &si_dpm_ip_funcs;
+	break;
+#endif
+#ifdef CONFIG_DRM_AMDGPU_CIK
 	case CHIP_BONAIRE:
 	case CHIP_HAWAII:
+		amd_pp->ip_funcs = &ci_dpm_ip_funcs;
+		break;
 	case CHIP_KABINI:
 	case CHIP_MULLINS:
 	case CHIP_KAVERI:
+		amd_pp->ip_funcs = &kv_dpm_ip_funcs;
+		break;
+#endif
 	default:
-		adev->pp_enabled = false;
+		ret = -EINVAL;
 		break;
 	}

-	ret = amdgpu_powerplay_init(adev);
-	if (ret)
-		return ret;
-
 	if (adev->powerplay.ip_funcs->early_init)
 		ret = adev->powerplay.ip_funcs->early_init(
 					adev->powerplay.pp_handle);
+
+	if (ret == PP_DPM_DISABLED) {
+		adev->pm.dpm_enabled = false;
+		return 0;
+	}
 	return ret;
 }

@ -185,6 +170,11 @@ static int amdgpu_pp_hw_init(void *handle)
 		ret = adev->powerplay.ip_funcs->hw_init(
 					adev->powerplay.pp_handle);

+	if (ret == PP_DPM_DISABLED) {
+		adev->pm.dpm_enabled = false;
+		return 0;
+	}
+
 	if ((amdgpu_dpm != 0) && !amdgpu_sriov_vf(adev))
 		adev->pm.dpm_enabled = true;

@ -210,14 +200,14 @@ static void amdgpu_pp_late_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;

-	if (adev->pp_enabled) {
-		amdgpu_pm_sysfs_fini(adev);
-		amd_powerplay_fini(adev->powerplay.pp_handle);
-	}
-
 	if (adev->powerplay.ip_funcs->late_fini)
 		adev->powerplay.ip_funcs->late_fini(
 			  adev->powerplay.pp_handle);
+
+	if (adev->pp_enabled && adev->pm.dpm_enabled)
+		amdgpu_pm_sysfs_fini(adev);
+
+	amd_powerplay_destroy(adev->powerplay.pp_handle);
 }

 static int amdgpu_pp_suspend(void *handle)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@ -207,6 +207,8 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 	}
 	ring->cond_exe_gpu_addr = adev->wb.gpu_addr + (ring->cond_exe_offs * 4);
 	ring->cond_exe_cpu_addr = &adev->wb.wb[ring->cond_exe_offs];
+	/* always set cond_exec_polling to CONTINUE */
+	*ring->cond_exe_cpu_addr = 1;

 	r = amdgpu_fence_driver_start_ring(ring, irq_src, irq_type);
 	if (r) {
@ -307,7 +309,7 @@ static ssize_t amdgpu_debugfs_ring_read(struct file *f, char __user *buf,
 	while (size) {
 		if (*pos >= (ring->ring_size + 12))
 			return result;
-			
+
 		value = ring->ring[(*pos - 12)/4];
 		r = put_user(value, (uint32_t*)buf);
 		if (r)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@ -135,6 +135,8 @@ struct amdgpu_ring_funcs {
 	void (*end_use)(struct amdgpu_ring *ring);
 	void (*emit_switch_buffer) (struct amdgpu_ring *ring);
 	void (*emit_cntxcntl) (struct amdgpu_ring *ring, uint32_t flags);
+	void (*emit_rreg)(struct amdgpu_ring *ring, uint32_t reg);
+	void (*emit_wreg)(struct amdgpu_ring *ring, uint32_t reg, uint32_t val);
 };

 struct amdgpu_ring {
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@ -24,7 +24,7 @@ TRACE_EVENT(amdgpu_mm_rreg,
 			   __entry->reg = reg;
 			   __entry->value = value;
 			   ),
-	    TP_printk("0x%04lx, 0x%04lx, 0x%08lx",
+	    TP_printk("0x%04lx, 0x%08lx, 0x%08lx",
 		      (unsigned long)__entry->did,
 		      (unsigned long)__entry->reg,
 		      (unsigned long)__entry->value)
@ -43,7 +43,7 @@ TRACE_EVENT(amdgpu_mm_wreg,
 			   __entry->reg = reg;
 			   __entry->value = value;
 			   ),
-	    TP_printk("0x%04lx, 0x%04lx, 0x%08lx",
+	    TP_printk("0x%04lx, 0x%08lx, 0x%08lx",
 		      (unsigned long)__entry->did,
 		      (unsigned long)__entry->reg,
 		      (unsigned long)__entry->value)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@ -466,10 +466,6 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo,

 	adev = amdgpu_ttm_adev(bo->bdev);

-	/* remember the eviction */
-	if (evict)
-		atomic64_inc(&adev->num_evictions);
-
 	if (old_mem->mem_type == TTM_PL_SYSTEM && bo->ttm == NULL) {
 		amdgpu_move_null(bo, new_mem);
 		return 0;
@ -533,6 +529,9 @@ static int amdgpu_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_
 	case TTM_PL_TT:
 		break;
 	case TTM_PL_VRAM:
+		if (mem->start == AMDGPU_BO_INVALID_OFFSET)
+			return -EINVAL;
+
 		mem->bus.offset = mem->start << PAGE_SHIFT;
 		/* check if it's visible */
 		if ((mem->bus.offset + mem->bus.size) > adev->mc.visible_vram_size)
@ -552,6 +551,8 @@ static int amdgpu_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_
 			mem->bus.addr =
 				ioremap_nocache(mem->bus.base + mem->bus.offset,
 						mem->bus.size);
+		if (!mem->bus.addr)
+			return -ENOMEM;

 		/*
 		 * Alpha: Use just the bus offset plus
@ -1052,56 +1053,6 @@ uint32_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
 	return flags;
 }

-static void amdgpu_ttm_lru_removal(struct ttm_buffer_object *tbo)
-{
-	struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
-	unsigned i, j;
-
-	for (i = 0; i < AMDGPU_TTM_LRU_SIZE; ++i) {
-		struct amdgpu_mman_lru *lru = &adev->mman.log2_size[i];
-
-		for (j = 0; j < TTM_NUM_MEM_TYPES; ++j)
-			if (&tbo->lru == lru->lru[j])
-				lru->lru[j] = tbo->lru.prev;
-
-		if (&tbo->swap == lru->swap_lru)
-			lru->swap_lru = tbo->swap.prev;
-	}
-}
-
-static struct amdgpu_mman_lru *amdgpu_ttm_lru(struct ttm_buffer_object *tbo)
-{
-	struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
-	unsigned log2_size = min(ilog2(tbo->num_pages),
-				 AMDGPU_TTM_LRU_SIZE - 1);
-
-	return &adev->mman.log2_size[log2_size];
-}
-
-static struct list_head *amdgpu_ttm_lru_tail(struct ttm_buffer_object *tbo)
-{
-	struct amdgpu_mman_lru *lru = amdgpu_ttm_lru(tbo);
-	struct list_head *res = lru->lru[tbo->mem.mem_type];
-
-	lru->lru[tbo->mem.mem_type] = &tbo->lru;
-	while ((++lru)->lru[tbo->mem.mem_type] == res)
-		lru->lru[tbo->mem.mem_type] = &tbo->lru;
-
-	return res;
-}
-
-static struct list_head *amdgpu_ttm_swap_lru_tail(struct ttm_buffer_object *tbo)
-{
-	struct amdgpu_mman_lru *lru = amdgpu_ttm_lru(tbo);
-	struct list_head *res = lru->swap_lru;
-
-	lru->swap_lru = &tbo->swap;
-	while ((++lru)->swap_lru == res)
-		lru->swap_lru = &tbo->swap;
-
-	return res;
-}
-
 static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 					    const struct ttm_place *place)
 {
@ -1140,14 +1091,10 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 	.fault_reserve_notify = &amdgpu_bo_fault_reserve_notify,
 	.io_mem_reserve = &amdgpu_ttm_io_mem_reserve,
 	.io_mem_free = &amdgpu_ttm_io_mem_free,
-	.lru_removal = &amdgpu_ttm_lru_removal,
-	.lru_tail = &amdgpu_ttm_lru_tail,
-	.swap_lru_tail = &amdgpu_ttm_swap_lru_tail,
 };

 int amdgpu_ttm_init(struct amdgpu_device *adev)
 {
-	unsigned i, j;
 	int r;

 	r = amdgpu_ttm_global_init(adev);
@ -1165,19 +1112,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 		DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
 		return r;
 	}
-
-	for (i = 0; i < AMDGPU_TTM_LRU_SIZE; ++i) {
-		struct amdgpu_mman_lru *lru = &adev->mman.log2_size[i];
-
-		for (j = 0; j < TTM_NUM_MEM_TYPES; ++j)
-			lru->lru[j] = &adev->mman.bdev.man[j].lru;
-		lru->swap_lru = &adev->mman.bdev.glob->swap_lru;
-	}
-
-	for (j = 0; j < TTM_NUM_MEM_TYPES; ++j)
-		adev->mman.guard.lru[j] = NULL;
-	adev->mman.guard.swap_lru = NULL;
-
 	adev->mman.initialized = true;
 	r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_VRAM,
 				adev->mc.real_vram_size >> PAGE_SHIFT);
@ -1365,7 +1299,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring,
 	WARN_ON(job->ibs[0].length_dw > num_dw);
 	if (direct_submit) {
 		r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs,
-				       NULL, NULL, fence);
+				       NULL, fence);
 		job->fence = dma_fence_get(*fence);
 		if (r)
 			DRM_ERROR("Error scheduling IBs (%d)\n", r);
@ -1482,18 +1416,18 @@ static int amdgpu_mm_dump_table(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct amdgpu_device *adev = dev->dev_private;
 	struct drm_mm *mm = (struct drm_mm *)adev->mman.bdev.man[ttm_pl].priv;
-	int ret;
 	struct ttm_bo_global *glob = adev->mman.bdev.glob;
+	struct drm_printer p = drm_seq_file_printer(m);

 	spin_lock(&glob->lru_lock);
-	ret = drm_mm_dump_table(m, mm);
+	drm_mm_print(mm, &p);
 	spin_unlock(&glob->lru_lock);
 	if (ttm_pl == TTM_PL_VRAM)
 		seq_printf(m, "man size:%llu pages, ram usage:%lluMB, vis usage:%lluMB\n",
 			   adev->mman.bdev.man[ttm_pl].size,
 			   (u64)atomic64_read(&adev->vram_usage) >> 20,
 			   (u64)atomic64_read(&adev->vram_vis_usage) >> 20);
-	return ret;
+	return 0;
 }

 static int ttm_pl_vram = TTM_PL_VRAM;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@ -34,13 +34,6 @@
 #define AMDGPU_PL_FLAG_GWS		(TTM_PL_FLAG_PRIV << 1)
 #define AMDGPU_PL_FLAG_OA		(TTM_PL_FLAG_PRIV << 2)

-#define AMDGPU_TTM_LRU_SIZE	20
-
-struct amdgpu_mman_lru {
-	struct list_head		*lru[TTM_NUM_MEM_TYPES];
-	struct list_head		*swap_lru;
-};
-
 struct amdgpu_mman {
 	struct ttm_bo_global_ref        bo_global_ref;
 	struct drm_global_reference	mem_global_ref;
@ -58,11 +51,6 @@ struct amdgpu_mman {
 	struct amdgpu_ring			*buffer_funcs_ring;
 	/* Scheduler entity for buffer moves */
 	struct amd_sched_entity			entity;
-
-	/* custom LRU management */
-	struct amdgpu_mman_lru			log2_size[AMDGPU_TTM_LRU_SIZE];
-	/* guard for log2_size array, don't add anything in between */
-	struct amdgpu_mman_lru			guard;
 };

 extern const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@ -976,7 +976,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
 	ib->length_dw = 16;

 	if (direct) {
-		r = amdgpu_ib_schedule(ring, 1, ib, NULL, NULL, &f);
+		r = amdgpu_ib_schedule(ring, 1, ib, NULL, &f);
 		job->fence = dma_fence_get(f);
 		if (r)
 			goto err_free;
@ -1113,6 +1113,11 @@ static void amdgpu_uvd_idle_work_handler(struct work_struct *work)
 			amdgpu_dpm_enable_uvd(adev, false);
 		} else {
 			amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+			/* shutdown the UVD block */
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							    AMD_PG_STATE_GATE);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							    AMD_CG_STATE_GATE);
 		}
 	} else {
 		schedule_delayed_work(&adev->uvd.idle_work, UVD_IDLE_TIMEOUT);
@ -1129,6 +1134,10 @@ void amdgpu_uvd_ring_begin_use(struct amdgpu_ring *ring)
 			amdgpu_dpm_enable_uvd(adev, true);
 		} else {
 			amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							    AMD_CG_STATE_UNGATE);
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							    AMD_PG_STATE_UNGATE);
 		}
 	}
 }
@ -1178,3 +1187,28 @@ int amdgpu_uvd_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 error:
 	return r;
 }
+
+/**
+ * amdgpu_uvd_used_handles - returns used UVD handles
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns the number of UVD handles in use
+ */
+uint32_t amdgpu_uvd_used_handles(struct amdgpu_device *adev)
+{
+	unsigned i;
+	uint32_t used_handles = 0;
+
+	for (i = 0; i < adev->uvd.max_handles; ++i) {
+		/*
+		 * Handles can be freed in any order, and not
+		 * necessarily linear. So we need to count
+		 * all non-zero handles.
+		 */
+		if (atomic_read(&adev->uvd.handles[i]))
+			used_handles++;
+	}
+
+	return used_handles;
+}
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
@ -38,5 +38,6 @@ int amdgpu_uvd_ring_parse_cs(struct amdgpu_cs_parser *parser, uint32_t ib_idx);
 void amdgpu_uvd_ring_begin_use(struct amdgpu_ring *ring);
 void amdgpu_uvd_ring_end_use(struct amdgpu_ring *ring);
 int amdgpu_uvd_ring_test_ib(struct amdgpu_ring *ring, long timeout);
+uint32_t amdgpu_uvd_used_handles(struct amdgpu_device *adev);

 #endif
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@ -321,6 +321,10 @@ static void amdgpu_vce_idle_work_handler(struct work_struct *work)
 			amdgpu_dpm_enable_vce(adev, false);
 		} else {
 			amdgpu_asic_set_vce_clocks(adev, 0, 0);
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							    AMD_PG_STATE_GATE);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							    AMD_CG_STATE_GATE);
 		}
 	} else {
 		schedule_delayed_work(&adev->vce.idle_work, VCE_IDLE_TIMEOUT);
@ -346,6 +350,11 @@ void amdgpu_vce_ring_begin_use(struct amdgpu_ring *ring)
 			amdgpu_dpm_enable_vce(adev, true);
 		} else {
 			amdgpu_asic_set_vce_clocks(adev, 53300, 40000);
+			amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							    AMD_CG_STATE_UNGATE);
+			amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+							    AMD_PG_STATE_UNGATE);
+
 		}
 	}
 	mutex_unlock(&adev->vce.idle_mutex);
@ -455,7 +464,7 @@ int amdgpu_vce_get_create_msg(struct amdgpu_ring *ring, uint32_t handle,
 	for (i = ib->length_dw; i < ib_size_dw; ++i)
 		ib->ptr[i] = 0x0;

-	r = amdgpu_ib_schedule(ring, 1, ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, ib, NULL, &f);
 	job->fence = dma_fence_get(f);
 	if (r)
 		goto err;
@ -518,7 +527,7 @@ int amdgpu_vce_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle,
 		ib->ptr[i] = 0x0;

 	if (direct) {
-		r = amdgpu_ib_schedule(ring, 1, ib, NULL, NULL, &f);
+		r = amdgpu_ib_schedule(ring, 1, ib, NULL, &f);
 		job->fence = dma_fence_get(f);
 		if (r)
 			goto err;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@ -0,0 +1,221 @@
+/*
+ * Copyright 2016 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu.h"
+
+int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
+{
+	int r;
+	void *ptr;
+
+	r = amdgpu_bo_create_kernel(adev, AMDGPU_CSA_SIZE, PAGE_SIZE,
+				AMDGPU_GEM_DOMAIN_VRAM, &adev->virt.csa_obj,
+				&adev->virt.csa_vmid0_addr, &ptr);
+	if (r)
+		return r;
+
+	memset(ptr, 0, AMDGPU_CSA_SIZE);
+	return 0;
+}
+
+/*
+ * amdgpu_map_static_csa should be called during amdgpu_vm_init
+ * it maps virtual address "AMDGPU_VA_RESERVED_SIZE - AMDGPU_CSA_SIZE"
+ * to this VM, and each command submission of GFX should use this virtual
+ * address within META_DATA init package to support SRIOV gfx preemption.
+ */
+
+int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm)
+{
+	int r;
+	struct amdgpu_bo_va *bo_va;
+	struct ww_acquire_ctx ticket;
+	struct list_head list;
+	struct amdgpu_bo_list_entry pd;
+	struct ttm_validate_buffer csa_tv;
+
+	INIT_LIST_HEAD(&list);
+	INIT_LIST_HEAD(&csa_tv.head);
+	csa_tv.bo = &adev->virt.csa_obj->tbo;
+	csa_tv.shared = true;
+
+	list_add(&csa_tv.head, &list);
+	amdgpu_vm_get_pd_bo(vm, &list, &pd);
+
+	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
+	if (r) {
+		DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
+		return r;
+	}
+
+	bo_va = amdgpu_vm_bo_add(adev, vm, adev->virt.csa_obj);
+	if (!bo_va) {
+		ttm_eu_backoff_reservation(&ticket, &list);
+		DRM_ERROR("failed to create bo_va for static CSA\n");
+		return -ENOMEM;
+	}
+
+	r = amdgpu_vm_bo_map(adev, bo_va, AMDGPU_CSA_VADDR, 0,AMDGPU_CSA_SIZE,
+						AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
+						AMDGPU_PTE_EXECUTABLE);
+
+	if (r) {
+		DRM_ERROR("failed to do bo_map on static CSA, err=%d\n", r);
+		amdgpu_vm_bo_rmv(adev, bo_va);
+		ttm_eu_backoff_reservation(&ticket, &list);
+		return r;
+	}
+
+	vm->csa_bo_va = bo_va;
+	ttm_eu_backoff_reservation(&ticket, &list);
+	return 0;
+}
+
+void amdgpu_virt_init_setting(struct amdgpu_device *adev)
+{
+	/* enable virtual display */
+	adev->mode_info.num_crtc = 1;
+	adev->enable_virtual_display = true;
+
+	mutex_init(&adev->virt.lock);
+}
+
+uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
+{
+	signed long r;
+	uint32_t val;
+	struct dma_fence *f;
+	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
+	struct amdgpu_ring *ring = &kiq->ring;
+
+	BUG_ON(!ring->funcs->emit_rreg);
+
+	mutex_lock(&adev->virt.lock);
+	amdgpu_ring_alloc(ring, 32);
+	amdgpu_ring_emit_hdp_flush(ring);
+	amdgpu_ring_emit_rreg(ring, reg);
+	amdgpu_ring_emit_hdp_invalidate(ring);
+	amdgpu_fence_emit(ring, &f);
+	amdgpu_ring_commit(ring);
+	mutex_unlock(&adev->virt.lock);
+
+	r = dma_fence_wait(f, false);
+	if (r)
+		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
+	dma_fence_put(f);
+
+	val = adev->wb.wb[adev->virt.reg_val_offs];
+
+	return val;
+}
+
+void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
+{
+	signed long r;
+	struct dma_fence *f;
+	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
+	struct amdgpu_ring *ring = &kiq->ring;
+
+	BUG_ON(!ring->funcs->emit_wreg);
+
+	mutex_lock(&adev->virt.lock);
+	amdgpu_ring_alloc(ring, 32);
+	amdgpu_ring_emit_hdp_flush(ring);
+	amdgpu_ring_emit_wreg(ring, reg, v);
+	amdgpu_ring_emit_hdp_invalidate(ring);
+	amdgpu_fence_emit(ring, &f);
+	amdgpu_ring_commit(ring);
+	mutex_unlock(&adev->virt.lock);
+
+	r = dma_fence_wait(f, false);
+	if (r)
+		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
+	dma_fence_put(f);
+}
+
+/**
+ * amdgpu_virt_request_full_gpu() - request full gpu access
+ * @amdgpu:	amdgpu device.
+ * @init:	is driver init time.
+ * When start to init/fini driver, first need to request full gpu access.
+ * Return: Zero if request success, otherwise will return error.
+ */
+int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init)
+{
+	struct amdgpu_virt *virt = &adev->virt;
+	int r;
+
+	if (virt->ops && virt->ops->req_full_gpu) {
+		r = virt->ops->req_full_gpu(adev, init);
+		if (r)
+			return r;
+
+		adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+	}
+
+	return 0;
+}
+
+/**
+ * amdgpu_virt_release_full_gpu() - release full gpu access
+ * @amdgpu:	amdgpu device.
+ * @init:	is driver init time.
+ * When finishing driver init/fini, need to release full gpu access.
+ * Return: Zero if release success, otherwise will returen error.
+ */
+int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init)
+{
+	struct amdgpu_virt *virt = &adev->virt;
+	int r;
+
+	if (virt->ops && virt->ops->rel_full_gpu) {
+		r = virt->ops->rel_full_gpu(adev, init);
+		if (r)
+			return r;
+
+		adev->virt.caps |= AMDGPU_SRIOV_CAPS_RUNTIME;
+	}
+	return 0;
+}
+
+/**
+ * amdgpu_virt_reset_gpu() - reset gpu
+ * @amdgpu:	amdgpu device.
+ * Send reset command to GPU hypervisor to reset GPU that VM is using
+ * Return: Zero if reset success, otherwise will return error.
+ */
+int amdgpu_virt_reset_gpu(struct amdgpu_device *adev)
+{
+	struct amdgpu_virt *virt = &adev->virt;
+	int r;
+
+	if (virt->ops && virt->ops->reset_gpu) {
+		r = virt->ops->reset_gpu(adev);
+		if (r)
+			return r;
+
+		adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+	}
+
+	return 0;
+}
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@ -28,22 +28,48 @@
 #define AMDGPU_SRIOV_CAPS_ENABLE_IOV   (1 << 1) /* sr-iov is enabled on this GPU */
 #define AMDGPU_SRIOV_CAPS_IS_VF        (1 << 2) /* this GPU is a virtual function */
 #define AMDGPU_PASSTHROUGH_MODE        (1 << 3) /* thw whole GPU is pass through for VM */
-/* GPU virtualization */
-struct amdgpu_virtualization {
-	uint32_t virtual_caps;
+#define AMDGPU_SRIOV_CAPS_RUNTIME      (1 << 4) /* is out of full access mode */
+
+/**
+ * struct amdgpu_virt_ops - amdgpu device virt operations
+ */
+struct amdgpu_virt_ops {
+	int (*req_full_gpu)(struct amdgpu_device *adev, bool init);
+	int (*rel_full_gpu)(struct amdgpu_device *adev, bool init);
+	int (*reset_gpu)(struct amdgpu_device *adev);
 };

+/* GPU virtualization */
+struct amdgpu_virt {
+	uint32_t			caps;
+	struct amdgpu_bo		*csa_obj;
+	uint64_t			csa_vmid0_addr;
+	bool chained_ib_support;
+	uint32_t			reg_val_offs;
+	struct mutex			lock;
+	struct amdgpu_irq_src		ack_irq;
+	struct amdgpu_irq_src		rcv_irq;
+	struct delayed_work		flr_work;
+	const struct amdgpu_virt_ops	*ops;
+};
+
+#define AMDGPU_CSA_SIZE    (8 * 1024)
+#define AMDGPU_CSA_VADDR   (AMDGPU_VA_RESERVED_SIZE - AMDGPU_CSA_SIZE)
+
 #define amdgpu_sriov_enabled(adev) \
-((adev)->virtualization.virtual_caps & AMDGPU_SRIOV_CAPS_ENABLE_IOV)
+((adev)->virt.caps & AMDGPU_SRIOV_CAPS_ENABLE_IOV)

 #define amdgpu_sriov_vf(adev) \
-((adev)->virtualization.virtual_caps & AMDGPU_SRIOV_CAPS_IS_VF)
+((adev)->virt.caps & AMDGPU_SRIOV_CAPS_IS_VF)

 #define amdgpu_sriov_bios(adev) \
-((adev)->virtualization.virtual_caps & AMDGPU_SRIOV_CAPS_SRIOV_VBIOS)
+((adev)->virt.caps & AMDGPU_SRIOV_CAPS_SRIOV_VBIOS)
+
+#define amdgpu_sriov_runtime(adev) \
+((adev)->virt.caps & AMDGPU_SRIOV_CAPS_RUNTIME)

 #define amdgpu_passthrough(adev) \
-((adev)->virtualization.virtual_caps & AMDGPU_PASSTHROUGH_MODE)
+((adev)->virt.caps & AMDGPU_PASSTHROUGH_MODE)

 static inline bool is_virtual_machine(void)
 {
@ -54,4 +80,14 @@ static inline bool is_virtual_machine(void)
 #endif
 }

-#endif
+struct amdgpu_vm;
+int amdgpu_allocate_static_csa(struct amdgpu_device *adev);
+int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm);
+void amdgpu_virt_init_setting(struct amdgpu_device *adev);
+uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
+void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
+int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
+int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
+int amdgpu_virt_reset_gpu(struct amdgpu_device *adev);
+
+#endif
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@ -1293,7 +1293,7 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
 int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		     struct amdgpu_bo_va *bo_va,
 		     uint64_t saddr, uint64_t offset,
-		     uint64_t size, uint32_t flags)
+		     uint64_t size, uint64_t flags)
 {
 	struct amdgpu_bo_va_mapping *mapping;
 	struct amdgpu_vm *vm = bo_va->vm;
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@ -111,6 +111,8 @@ struct amdgpu_vm {

 	/* client id */
 	u64                     client_id;
+	/* each VM will map on CSA */
+	struct amdgpu_bo_va *csa_bo_va;
 };

 struct amdgpu_vm_id {
@ -195,7 +197,7 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
 int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		     struct amdgpu_bo_va *bo_va,
 		     uint64_t addr, uint64_t offset,
-		     uint64_t size, uint32_t flags);
+		     uint64_t size, uint64_t flags);
 int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,
 		       struct amdgpu_bo_va *bo_va,
 		       uint64_t addr);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@ -97,8 +97,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	struct amdgpu_vram_mgr *mgr = man->priv;
 	struct drm_mm *mm = &mgr->mm;
 	struct drm_mm_node *nodes;
-	enum drm_mm_search_flags sflags = DRM_MM_SEARCH_DEFAULT;
-	enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT;
+	enum drm_mm_insert_mode mode;
 	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
 	unsigned i;
 	int r;
@ -121,10 +120,9 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	if (!nodes)
 		return -ENOMEM;

-	if (place->flags & TTM_PL_FLAG_TOPDOWN) {
-		sflags = DRM_MM_SEARCH_BELOW;
-		aflags = DRM_MM_CREATE_TOP;
-	}
+	mode = DRM_MM_INSERT_BEST;
+	if (place->flags & TTM_PL_FLAG_TOPDOWN)
+		mode = DRM_MM_INSERT_HIGH;

 	pages_left = mem->num_pages;

@ -135,13 +133,11 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,

 		if (pages == pages_per_node)
 			alignment = pages_per_node;
-		else
-			sflags |= DRM_MM_SEARCH_BEST;

-		r = drm_mm_insert_node_in_range_generic(mm, &nodes[i], pages,
-							alignment, 0,
-							place->fpfn, lpfn,
-							sflags, aflags);
+		r = drm_mm_insert_node_in_range(mm, &nodes[i],
+						pages, alignment, 0,
+						place->fpfn, lpfn,
+						mode);
 		if (unlikely(r))
 			goto error;

@ -207,9 +203,10 @@ static void amdgpu_vram_mgr_debug(struct ttm_mem_type_manager *man,
 				  const char *prefix)
 {
 	struct amdgpu_vram_mgr *mgr = man->priv;
+	struct drm_printer p = drm_debug_printer(prefix);

 	spin_lock(&mgr->lock);
-	drm_mm_debug_table(&mgr->mm, prefix);
+	drm_mm_print(&mgr->mm, &p);
 	spin_unlock(&mgr->lock);
 }

--- a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
+++ b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
@ -181,9 +181,6 @@ void amdgpu_atombios_encoder_init_backlight(struct amdgpu_encoder *amdgpu_encode
 	if (!amdgpu_encoder->enc_priv)
 		return;

-	if (!adev->is_atom_bios)
-		return;
-
 	if (!(adev->mode_info.firmware_flags & ATOM_BIOS_INFO_BL_CONTROLLED_BY_GPU))
 		return;

@ -236,9 +233,6 @@ amdgpu_atombios_encoder_fini_backlight(struct amdgpu_encoder *amdgpu_encoder)
 	if (!amdgpu_encoder->enc_priv)
 		return;

-	if (!adev->is_atom_bios)
-		return;
-
 	if (!(adev->mode_info.firmware_flags & ATOM_BIOS_INFO_BL_CONTROLLED_BY_GPU))
 		return;

--- a/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
@ -889,7 +889,16 @@ static void ci_dpm_powergate_uvd(struct amdgpu_device *adev, bool gate)

 	pi->uvd_power_gated = gate;

-	ci_update_uvd_dpm(adev, gate);
+	if (gate) {
+		/* stop the UVD block */
+		amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							AMD_PG_STATE_GATE);
+		ci_update_uvd_dpm(adev, gate);
+	} else {
+		amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							AMD_PG_STATE_UNGATE);
+		ci_update_uvd_dpm(adev, gate);
+	}
 }

 static bool ci_dpm_vblank_too_short(struct amdgpu_device *adev)
@ -2201,7 +2210,6 @@ static void ci_clear_vc(struct amdgpu_device *adev)

 static int ci_upload_firmware(struct amdgpu_device *adev)
 {
-	struct ci_power_info *pi = ci_get_pi(adev);
 	int i, ret;

 	if (amdgpu_ci_is_smc_running(adev)) {
@ -2218,7 +2226,7 @@ static int ci_upload_firmware(struct amdgpu_device *adev)
 	amdgpu_ci_stop_smc_clock(adev);
 	amdgpu_ci_reset_smc(adev);

-	ret = amdgpu_ci_load_smc_ucode(adev, pi->sram_end);
+	ret = amdgpu_ci_load_smc_ucode(adev, SMC_RAM_END);

 	return ret;

@ -4248,12 +4256,6 @@ static int ci_update_vce_dpm(struct amdgpu_device *adev,

 	if (amdgpu_current_state->evclk != amdgpu_new_state->evclk) {
 		if (amdgpu_new_state->evclk) {
-			/* turn the clocks on when encoding */
-			ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-							    AMD_CG_STATE_UNGATE);
-			if (ret)
-				return ret;
-
 			pi->smc_state_table.VceBootLevel = ci_get_vce_boot_level(adev);
 			tmp = RREG32_SMC(ixDPM_TABLE_475);
 			tmp &= ~DPM_TABLE_475__VceBootLevel_MASK;
@ -4265,9 +4267,6 @@ static int ci_update_vce_dpm(struct amdgpu_device *adev,
 			ret = ci_enable_vce_dpm(adev, false);
 			if (ret)
 				return ret;
-			/* turn the clocks off when not encoding */
-			ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-							    AMD_CG_STATE_GATE);
 		}
 	}
 	return ret;
@ -4336,13 +4335,13 @@ static u32 ci_get_lowest_enabled_level(struct amdgpu_device *adev,


 static int ci_dpm_force_performance_level(struct amdgpu_device *adev,
-					  enum amdgpu_dpm_forced_level level)
+					  enum amd_dpm_forced_level level)
 {
 	struct ci_power_info *pi = ci_get_pi(adev);
 	u32 tmp, levels, i;
 	int ret;

-	if (level == AMDGPU_DPM_FORCED_LEVEL_HIGH) {
+	if (level == AMD_DPM_FORCED_LEVEL_HIGH) {
 		if ((!pi->pcie_dpm_key_disabled) &&
 		    pi->dpm_level_enable_mask.pcie_dpm_enable_mask) {
 			levels = 0;
@ -4403,7 +4402,7 @@ static int ci_dpm_force_performance_level(struct amdgpu_device *adev,
 				}
 			}
 		}
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_LOW) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_LOW) {
 		if ((!pi->sclk_dpm_key_disabled) &&
 		    pi->dpm_level_enable_mask.sclk_dpm_enable_mask) {
 			levels = ci_get_lowest_enabled_level(adev,
@ -4452,7 +4451,7 @@ static int ci_dpm_force_performance_level(struct amdgpu_device *adev,
 				udelay(1);
 			}
 		}
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_AUTO) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_AUTO) {
 		if (!pi->pcie_dpm_key_disabled) {
 			PPSMC_Result smc_result;

@ -6262,20 +6261,20 @@ static int ci_dpm_sw_init(void *handle)
 	/* default to balanced state */
 	adev->pm.dpm.state = POWER_STATE_TYPE_BALANCED;
 	adev->pm.dpm.user_state = POWER_STATE_TYPE_BALANCED;
-	adev->pm.dpm.forced_level = AMDGPU_DPM_FORCED_LEVEL_AUTO;
+	adev->pm.dpm.forced_level = AMD_DPM_FORCED_LEVEL_AUTO;
 	adev->pm.default_sclk = adev->clock.default_sclk;
 	adev->pm.default_mclk = adev->clock.default_mclk;
 	adev->pm.current_sclk = adev->clock.default_sclk;
 	adev->pm.current_mclk = adev->clock.default_mclk;
 	adev->pm.int_thermal_type = THERMAL_TYPE_NONE;

-	if (amdgpu_dpm == 0)
-		return 0;
-
 	ret = ci_dpm_init_microcode(adev);
 	if (ret)
 		return ret;

+	if (amdgpu_dpm == 0)
+		return 0;
+
 	INIT_WORK(&adev->pm.dpm.thermal.work, amdgpu_dpm_thermal_work_handler);
 	mutex_lock(&adev->pm.mutex);
 	ret = ci_dpm_init(adev);
@ -6319,8 +6318,15 @@ static int ci_dpm_hw_init(void *handle)

 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;

-	if (!amdgpu_dpm)
+	if (!amdgpu_dpm) {
+		ret = ci_upload_firmware(adev);
+		if (ret) {
+			DRM_ERROR("ci_upload_firmware failed\n");
+			return ret;
+		}
+		ci_dpm_start_smc(adev);
 		return 0;
+	}

 	mutex_lock(&adev->pm.mutex);
 	ci_dpm_setup_asic(adev);
@ -6342,6 +6348,8 @@ static int ci_dpm_hw_fini(void *handle)
 		mutex_lock(&adev->pm.mutex);
 		ci_dpm_disable(adev);
 		mutex_unlock(&adev->pm.mutex);
+	} else {
+		ci_dpm_stop_smc(adev);
 	}

 	return 0;
@ -6571,8 +6579,9 @@ static int ci_dpm_force_clock_level(struct amdgpu_device *adev,
 {
 	struct ci_power_info *pi = ci_get_pi(adev);

-	if (adev->pm.dpm.forced_level
-			!= AMDGPU_DPM_FORCED_LEVEL_MANUAL)
+	if (adev->pm.dpm.forced_level & (AMD_DPM_FORCED_LEVEL_AUTO |
+				AMD_DPM_FORCED_LEVEL_LOW |
+				AMD_DPM_FORCED_LEVEL_HIGH))
 		return -EINVAL;

 	switch (type) {
@ -6739,12 +6748,3 @@ static void ci_dpm_set_irq_funcs(struct amdgpu_device *adev)
 	adev->pm.dpm.thermal.irq.num_types = AMDGPU_THERMAL_IRQ_LAST;
 	adev->pm.dpm.thermal.irq.funcs = &ci_dpm_irq_funcs;
 }
-
-const struct amdgpu_ip_block_version ci_dpm_ip_block =
-{
-	.type = AMD_IP_BLOCK_TYPE_SMC,
-	.major = 7,
-	.minor = 0,
-	.rev = 0,
-	.funcs = &ci_dpm_ip_funcs,
-};
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@ -1176,6 +1176,7 @@ static int cik_gpu_pci_config_reset(struct amdgpu_device *adev)
 		if (RREG32(mmCONFIG_MEMSIZE) != 0xffffffff) {
 			/* enable BM */
 			pci_set_master(adev->pdev);
+			adev->has_hw_reset = true;
 			r = 0;
 			break;
 		}
@ -1627,14 +1628,13 @@ static uint32_t cik_get_rev_id(struct amdgpu_device *adev)
 static void cik_detect_hw_virtualization(struct amdgpu_device *adev)
 {
 	if (is_virtual_machine()) /* passthrough mode */
-		adev->virtualization.virtual_caps |= AMDGPU_PASSTHROUGH_MODE;
+		adev->virt.caps |= AMDGPU_PASSTHROUGH_MODE;
 }

 static const struct amdgpu_asic_funcs cik_asic_funcs =
 {
 	.read_disabled_bios = &cik_read_disabled_bios,
 	.read_bios_from_rom = &cik_read_bios_from_rom,
-	.detect_hw_virtualization = cik_detect_hw_virtualization,
 	.read_register = &cik_read_register,
 	.reset = &cik_asic_reset,
 	.set_vga_state = &cik_vga_set_state,
@ -1723,8 +1723,8 @@ static int cik_common_early_init(void *handle)
 			  AMD_PG_SUPPORT_GFX_SMG |
 			  AMD_PG_SUPPORT_GFX_DMG |*/
 			AMD_PG_SUPPORT_UVD |
-			/*AMD_PG_SUPPORT_VCE |
-			  AMD_PG_SUPPORT_CP |
+			AMD_PG_SUPPORT_VCE |
+			/*  AMD_PG_SUPPORT_CP |
 			  AMD_PG_SUPPORT_GDS |
 			  AMD_PG_SUPPORT_RLC_SMU_HS |
 			  AMD_PG_SUPPORT_ACP |
@ -1890,6 +1890,8 @@ static const struct amdgpu_ip_block_version cik_common_ip_block =

 int cik_set_ip_blocks(struct amdgpu_device *adev)
 {
+	cik_detect_hw_virtualization(adev);
+
 	switch (adev->asic_type) {
 	case CHIP_BONAIRE:
 		amdgpu_ip_block_add(adev, &cik_common_ip_block);
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@ -651,7 +651,7 @@ static int cik_sdma_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[3] = 1;
 	ib.ptr[4] = 0xDEADBEEF;
 	ib.length_dw = 5;
-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err1;

--- a/drivers/gpu/drm/amd/include/asic_reg/si/clearstate_si.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/si/clearstate_si.h
--- a/drivers/gpu/drm/amd/amdgpu/cz_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_dpm.c
--- a/drivers/gpu/drm/amd/amdgpu/cz_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/cz_dpm.h
@ -1,239 +0,0 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-
-#ifndef __CZ_DPM_H__
-#define __CZ_DPM_H__
-
-#include "smu8_fusion.h"
-
-#define CZ_AT_DFLT					30
-#define CZ_NUM_NBPSTATES				4
-#define CZ_NUM_NBPMEMORY_CLOCK				2
-#define CZ_MAX_HARDWARE_POWERLEVELS			8
-#define CZ_MAX_DISPLAY_CLOCK_LEVEL			8
-#define CZ_MAX_DISPLAYPHY_IDS				10
-
-#define PPCZ_VOTINGRIGHTSCLIENTS_DFLT0			0x3FFFC102
-
-#define SMC_RAM_END					0x40000
-
-#define DPMFlags_SCLK_Enabled				0x00000001
-#define DPMFlags_UVD_Enabled				0x00000002
-#define DPMFlags_VCE_Enabled				0x00000004
-#define DPMFlags_ACP_Enabled				0x00000008
-#define DPMFlags_ForceHighestValid			0x40000000
-#define DPMFlags_Debug					0x80000000
-
-/* Do not change the following, it is also defined in SMU8.h */
-#define SMU_EnabledFeatureScoreboard_AcpDpmOn		0x00000001
-#define SMU_EnabledFeatureScoreboard_SclkDpmOn		0x00200000
-#define SMU_EnabledFeatureScoreboard_UvdDpmOn		0x00800000
-#define SMU_EnabledFeatureScoreboard_VceDpmOn		0x01000000
-
-/* temporary solution to SetMinDeepSleepSclk
- * should indicate by display adaptor
- * 10k Hz unit*/
-#define CZ_MIN_DEEP_SLEEP_SCLK				800
-
-enum cz_pt_config_reg_type {
-	CZ_CONFIGREG_MMR = 0,
-	CZ_CONFIGREG_SMC_IND,
-	CZ_CONFIGREG_DIDT_IND,
-	CZ_CONFIGREG_CACHE,
-	CZ_CONFIGREG_MAX
-};
-
-struct cz_pt_config_reg {
-	uint32_t offset;
-	uint32_t mask;
-	uint32_t shift;
-	uint32_t value;
-	enum cz_pt_config_reg_type type;
-};
-
-struct cz_dpm_entry {
-	uint32_t	soft_min_clk;
-	uint32_t	hard_min_clk;
-	uint32_t	soft_max_clk;
-	uint32_t	hard_max_clk;
-};
-
-struct cz_pl {
-	uint32_t sclk;
-	uint8_t vddc_index;
-	uint8_t ds_divider_index;
-	uint8_t ss_divider_index;
-	uint8_t allow_gnb_slow;
-	uint8_t force_nbp_state;
-	uint8_t display_wm;
-	uint8_t vce_wm;
-};
-
-struct cz_ps {
-	struct cz_pl levels[CZ_MAX_HARDWARE_POWERLEVELS];
-	uint32_t num_levels;
-	bool need_dfs_bypass;
-	uint8_t dpm0_pg_nb_ps_lo;
-	uint8_t dpm0_pg_nb_ps_hi;
-	uint8_t dpmx_nb_ps_lo;
-	uint8_t dpmx_nb_ps_hi;
-	bool force_high;
-};
-
-struct cz_displayphy_entry {
-	uint8_t phy_present;
-	uint8_t active_lane_mapping;
-	uint8_t display_conf_type;
-	uint8_t num_active_lanes;
-};
-
-struct cz_displayphy_info {
-	bool phy_access_initialized;
-	struct cz_displayphy_entry entries[CZ_MAX_DISPLAYPHY_IDS];
-};
-
-struct cz_sys_info {
-	uint32_t bootup_uma_clk;
-	uint32_t bootup_sclk;
-	uint32_t dentist_vco_freq;
-	uint32_t nb_dpm_enable;
-	uint32_t nbp_memory_clock[CZ_NUM_NBPMEMORY_CLOCK];
-	uint32_t nbp_n_clock[CZ_NUM_NBPSTATES];
-	uint8_t nbp_voltage_index[CZ_NUM_NBPSTATES];
-	uint32_t display_clock[CZ_MAX_DISPLAY_CLOCK_LEVEL];
-	uint16_t bootup_nb_voltage_index;
-	uint8_t htc_tmp_lmt;
-	uint8_t htc_hyst_lmt;
-	uint32_t uma_channel_number;
-};
-
-struct cz_power_info {
-	uint32_t active_target[CZ_MAX_HARDWARE_POWERLEVELS];
-	struct cz_sys_info sys_info;
-	struct cz_pl boot_pl;
-	bool disable_nb_ps3_in_battery;
-	bool battery_state;
-	uint32_t lowest_valid;
-	uint32_t highest_valid;
-	uint16_t high_voltage_threshold;
-	/* smc offsets */
-	uint32_t sram_end;
-	uint32_t dpm_table_start;
-	uint32_t soft_regs_start;
-	/* dpm SMU tables */
-	uint8_t uvd_level_count;
-	uint8_t vce_level_count;
-	uint8_t acp_level_count;
-	uint32_t fps_high_threshold;
-	uint32_t fps_low_threshold;
-	/* dpm table */
-	uint32_t dpm_flags;
-	struct cz_dpm_entry sclk_dpm;
-	struct cz_dpm_entry uvd_dpm;
-	struct cz_dpm_entry vce_dpm;
-	struct cz_dpm_entry acp_dpm;
-
-	uint8_t uvd_boot_level;
-	uint8_t uvd_interval;
-	uint8_t vce_boot_level;
-	uint8_t vce_interval;
-	uint8_t acp_boot_level;
-	uint8_t acp_interval;
-
-	uint8_t graphics_boot_level;
-	uint8_t graphics_interval;
-	uint8_t graphics_therm_throttle_enable;
-	uint8_t graphics_voltage_change_enable;
-	uint8_t graphics_clk_slow_enable;
-	uint8_t graphics_clk_slow_divider;
-
-	uint32_t low_sclk_interrupt_threshold;
-	bool uvd_power_gated;
-	bool vce_power_gated;
-	bool acp_power_gated;
-
-	uint32_t active_process_mask;
-
-	uint32_t mgcg_cgtt_local0;
-	uint32_t mgcg_cgtt_local1;
-	uint32_t clock_slow_down_step;
-	uint32_t skip_clock_slow_down;
-	bool enable_nb_ps_policy;
-	uint32_t voting_clients;
-	uint32_t voltage_drop_threshold;
-	uint32_t gfx_pg_threshold;
-	uint32_t max_sclk_level;
-	uint32_t max_uvd_level;
-	uint32_t max_vce_level;
-	/* flags */
-	bool didt_enabled;
-	bool video_start;
-	bool cac_enabled;
-	bool bapm_enabled;
-	bool nb_dpm_enabled_by_driver;
-	bool nb_dpm_enabled;
-	bool auto_thermal_throttling_enabled;
-	bool dpm_enabled;
-	bool need_pptable_upload;
-	/* caps */
-	bool caps_cac;
-	bool caps_power_containment;
-	bool caps_sq_ramping;
-	bool caps_db_ramping;
-	bool caps_td_ramping;
-	bool caps_tcp_ramping;
-	bool caps_sclk_throttle_low_notification;
-	bool caps_fps;
-	bool caps_uvd_dpm;
-	bool caps_uvd_pg;
-	bool caps_vce_dpm;
-	bool caps_vce_pg;
-	bool caps_acp_dpm;
-	bool caps_acp_pg;
-	bool caps_stable_power_state;
-	bool caps_enable_dfs_bypass;
-	bool caps_sclk_ds;
-	bool caps_voltage_island;
-	/* power state */
-	struct amdgpu_ps current_rps;
-	struct cz_ps current_ps;
-	struct amdgpu_ps requested_rps;
-	struct cz_ps requested_ps;
-
-	bool uvd_power_down;
-	bool vce_power_down;
-	bool acp_power_down;
-
-	bool uvd_dynamic_pg;
-};
-
-/* cz_smc.c */
-uint32_t cz_get_argument(struct amdgpu_device *adev);
-int cz_send_msg_to_smc(struct amdgpu_device *adev, uint16_t msg);
-int cz_send_msg_to_smc_with_parameter(struct amdgpu_device *adev,
-			uint16_t msg, uint32_t parameter);
-int cz_read_smc_sram_dword(struct amdgpu_device *adev,
-			uint32_t smc_address, uint32_t *value, uint32_t limit);
-int cz_smu_upload_pptable(struct amdgpu_device *adev);
-int cz_smu_download_pptable(struct amdgpu_device *adev, void **table);
-#endif
--- a/drivers/gpu/drm/amd/amdgpu/cz_smc.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_smc.c
@ -1,995 +0,0 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-#include <linux/firmware.h>
-#include "drmP.h"
-#include "amdgpu.h"
-#include "smu8.h"
-#include "smu8_fusion.h"
-#include "cz_ppsmc.h"
-#include "cz_smumgr.h"
-#include "smu_ucode_xfer_cz.h"
-#include "amdgpu_ucode.h"
-#include "cz_dpm.h"
-#include "vi_dpm.h"
-
-#include "smu/smu_8_0_d.h"
-#include "smu/smu_8_0_sh_mask.h"
-#include "gca/gfx_8_0_d.h"
-#include "gca/gfx_8_0_sh_mask.h"
-
-uint32_t cz_get_argument(struct amdgpu_device *adev)
-{
-	return RREG32(mmSMU_MP1_SRBM2P_ARG_0);
-}
-
-static struct cz_smu_private_data *cz_smu_get_priv(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv =
-			(struct cz_smu_private_data *)(adev->smu.priv);
-
-	return priv;
-}
-
-static int cz_send_msg_to_smc_async(struct amdgpu_device *adev, u16 msg)
-{
-	int i;
-	u32 content = 0, tmp;
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = REG_GET_FIELD(RREG32(mmSMU_MP1_SRBM2P_RESP_0),
-				SMU_MP1_SRBM2P_RESP_0, CONTENT);
-		if (content != tmp)
-			break;
-		udelay(1);
-	}
-
-	/* timeout means wrong logic*/
-	if (i == adev->usec_timeout)
-		return -EINVAL;
-
-	WREG32(mmSMU_MP1_SRBM2P_RESP_0, 0);
-	WREG32(mmSMU_MP1_SRBM2P_MSG_0, msg);
-
-	return 0;
-}
-
-int cz_send_msg_to_smc(struct amdgpu_device *adev, u16 msg)
-{
-	int i;
-	u32 content = 0, tmp = 0;
-
-	if (cz_send_msg_to_smc_async(adev, msg))
-		return -EINVAL;
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = REG_GET_FIELD(RREG32(mmSMU_MP1_SRBM2P_RESP_0),
-				SMU_MP1_SRBM2P_RESP_0, CONTENT);
-		if (content != tmp)
-			break;
-		udelay(1);
-	}
-
-	/* timeout means wrong logic*/
-	if (i == adev->usec_timeout)
-		return -EINVAL;
-
-	if (PPSMC_Result_OK != tmp) {
-		dev_err(adev->dev, "SMC Failed to send Message.\n");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-int cz_send_msg_to_smc_with_parameter(struct amdgpu_device *adev,
-						u16 msg, u32 parameter)
-{
-	WREG32(mmSMU_MP1_SRBM2P_ARG_0, parameter);
-	return cz_send_msg_to_smc(adev, msg);
-}
-
-static int cz_set_smc_sram_address(struct amdgpu_device *adev,
-						u32 smc_address, u32 limit)
-{
-	if (smc_address & 3)
-		return -EINVAL;
-	if ((smc_address + 3) > limit)
-		return -EINVAL;
-
-	WREG32(mmMP0PUB_IND_INDEX_0, SMN_MP1_SRAM_START_ADDR + smc_address);
-
-	return 0;
-}
-
-int cz_read_smc_sram_dword(struct amdgpu_device *adev, u32 smc_address,
-						u32 *value, u32 limit)
-{
-	int ret;
-
-	ret = cz_set_smc_sram_address(adev, smc_address, limit);
-	if (ret)
-		return ret;
-
-	*value = RREG32(mmMP0PUB_IND_DATA_0);
-
-	return 0;
-}
-
-static int cz_write_smc_sram_dword(struct amdgpu_device *adev, u32 smc_address,
-						u32 value, u32 limit)
-{
-	int ret;
-
-	ret = cz_set_smc_sram_address(adev, smc_address, limit);
-	if (ret)
-		return ret;
-
-	WREG32(mmMP0PUB_IND_DATA_0, value);
-
-	return 0;
-}
-
-static int cz_smu_request_load_fw(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	uint32_t smc_addr = SMU8_FIRMWARE_HEADER_LOCATION +
-			offsetof(struct SMU8_Firmware_Header, UcodeLoadStatus);
-
-	cz_write_smc_sram_dword(adev, smc_addr, 0, smc_addr + 4);
-
-	/*prepare toc buffers*/
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_DriverDramAddrHi,
-				priv->toc_buffer.mc_addr_high);
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_DriverDramAddrLo,
-				priv->toc_buffer.mc_addr_low);
-	cz_send_msg_to_smc(adev, PPSMC_MSG_InitJobs);
-
-	/*execute jobs*/
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_ExecuteJob,
-				priv->toc_entry_aram);
-
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_ExecuteJob,
-				priv->toc_entry_power_profiling_index);
-
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_ExecuteJob,
-				priv->toc_entry_initialize_index);
-
-	return 0;
-}
-
-/*
- *Check if the FW has been loaded, SMU will not return if loading
- *has not finished.
- */
-static int cz_smu_check_fw_load_finish(struct amdgpu_device *adev,
-						uint32_t fw_mask)
-{
-	int i;
-	uint32_t index = SMN_MP1_SRAM_START_ADDR +
-			SMU8_FIRMWARE_HEADER_LOCATION +
-			offsetof(struct SMU8_Firmware_Header, UcodeLoadStatus);
-
-	WREG32(mmMP0PUB_IND_INDEX, index);
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		if (fw_mask == (RREG32(mmMP0PUB_IND_DATA) & fw_mask))
-			break;
-		udelay(1);
-	}
-
-	if (i >= adev->usec_timeout) {
-		dev_err(adev->dev,
-		"SMU check loaded firmware failed, expecting 0x%x, getting 0x%x",
-		fw_mask, RREG32(mmMP0PUB_IND_DATA));
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-/*
- * interfaces for different ip blocks to check firmware loading status
- * 0 for success otherwise failed
- */
-static int cz_smu_check_finished(struct amdgpu_device *adev,
-							enum AMDGPU_UCODE_ID id)
-{
-	switch (id) {
-	case AMDGPU_UCODE_ID_SDMA0:
-		if (adev->smu.fw_flags & AMDGPU_SDMA0_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_SDMA1:
-		if (adev->smu.fw_flags & AMDGPU_SDMA1_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_CP_CE:
-		if (adev->smu.fw_flags & AMDGPU_CPCE_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_CP_PFP:
-		if (adev->smu.fw_flags & AMDGPU_CPPFP_UCODE_LOADED)
-			return 0;
-	case AMDGPU_UCODE_ID_CP_ME:
-		if (adev->smu.fw_flags & AMDGPU_CPME_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1:
-		if (adev->smu.fw_flags & AMDGPU_CPMEC1_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2:
-		if (adev->smu.fw_flags & AMDGPU_CPMEC2_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_RLC_G:
-		if (adev->smu.fw_flags & AMDGPU_CPRLC_UCODE_LOADED)
-			return 0;
-		break;
-	case AMDGPU_UCODE_ID_MAXIMUM:
-	default:
-		break;
-	}
-
-	return 1;
-}
-
-static int cz_load_mec_firmware(struct amdgpu_device *adev)
-{
-	struct amdgpu_firmware_info *ucode =
-				&adev->firmware.ucode[AMDGPU_UCODE_ID_CP_MEC1];
-	uint32_t reg_data;
-	uint32_t tmp;
-
-	if (ucode->fw == NULL)
-		return -EINVAL;
-
-	/* Disable MEC parsing/prefetching */
-	tmp = RREG32(mmCP_MEC_CNTL);
-	tmp = REG_SET_FIELD(tmp, CP_MEC_CNTL, MEC_ME1_HALT, 1);
-	tmp = REG_SET_FIELD(tmp, CP_MEC_CNTL, MEC_ME2_HALT, 1);
-	WREG32(mmCP_MEC_CNTL, tmp);
-
-	tmp = RREG32(mmCP_CPC_IC_BASE_CNTL);
-	tmp = REG_SET_FIELD(tmp, CP_CPC_IC_BASE_CNTL, VMID, 0);
-	tmp = REG_SET_FIELD(tmp, CP_CPC_IC_BASE_CNTL, ATC, 0);
-	tmp = REG_SET_FIELD(tmp, CP_CPC_IC_BASE_CNTL, CACHE_POLICY, 0);
-	tmp = REG_SET_FIELD(tmp, CP_CPC_IC_BASE_CNTL, MTYPE, 1);
-	WREG32(mmCP_CPC_IC_BASE_CNTL, tmp);
-
-	reg_data = lower_32_bits(ucode->mc_addr) &
-			REG_FIELD_MASK(CP_CPC_IC_BASE_LO, IC_BASE_LO);
-	WREG32(mmCP_CPC_IC_BASE_LO, reg_data);
-
-	reg_data = upper_32_bits(ucode->mc_addr) &
-			REG_FIELD_MASK(CP_CPC_IC_BASE_HI, IC_BASE_HI);
-	WREG32(mmCP_CPC_IC_BASE_HI, reg_data);
-
-	return 0;
-}
-
-int cz_smu_start(struct amdgpu_device *adev)
-{
-	int ret = 0;
-
-	uint32_t fw_to_check = UCODE_ID_RLC_G_MASK |
-				UCODE_ID_SDMA0_MASK |
-				UCODE_ID_SDMA1_MASK |
-				UCODE_ID_CP_CE_MASK |
-				UCODE_ID_CP_ME_MASK |
-				UCODE_ID_CP_PFP_MASK |
-				UCODE_ID_CP_MEC_JT1_MASK |
-				UCODE_ID_CP_MEC_JT2_MASK;
-
-	if (adev->asic_type == CHIP_STONEY)
-		fw_to_check &= ~(UCODE_ID_SDMA1_MASK | UCODE_ID_CP_MEC_JT2_MASK);
-
-	cz_smu_request_load_fw(adev);
-	ret = cz_smu_check_fw_load_finish(adev, fw_to_check);
-	if (ret)
-		return ret;
-
-	/* manually load MEC firmware for CZ */
-	if (adev->asic_type == CHIP_CARRIZO || adev->asic_type == CHIP_STONEY) {
-		ret = cz_load_mec_firmware(adev);
-		if (ret) {
-			dev_err(adev->dev, "(%d) Mec Firmware load failed\n", ret);
-			return ret;
-		}
-	}
-
-	/* setup fw load flag */
-	adev->smu.fw_flags = AMDGPU_SDMA0_UCODE_LOADED |
-				AMDGPU_SDMA1_UCODE_LOADED |
-				AMDGPU_CPCE_UCODE_LOADED |
-				AMDGPU_CPPFP_UCODE_LOADED |
-				AMDGPU_CPME_UCODE_LOADED |
-				AMDGPU_CPMEC1_UCODE_LOADED |
-				AMDGPU_CPMEC2_UCODE_LOADED |
-				AMDGPU_CPRLC_UCODE_LOADED;
-
-	if (adev->asic_type == CHIP_STONEY)
-		adev->smu.fw_flags &= ~(AMDGPU_SDMA1_UCODE_LOADED | AMDGPU_CPMEC2_UCODE_LOADED);
-
-	return ret;
-}
-
-static uint32_t cz_convert_fw_type(uint32_t fw_type)
-{
-	enum AMDGPU_UCODE_ID result = AMDGPU_UCODE_ID_MAXIMUM;
-
-	switch (fw_type) {
-	case UCODE_ID_SDMA0:
-		result = AMDGPU_UCODE_ID_SDMA0;
-		break;
-	case UCODE_ID_SDMA1:
-		result = AMDGPU_UCODE_ID_SDMA1;
-		break;
-	case UCODE_ID_CP_CE:
-		result = AMDGPU_UCODE_ID_CP_CE;
-		break;
-	case UCODE_ID_CP_PFP:
-		result = AMDGPU_UCODE_ID_CP_PFP;
-		break;
-	case UCODE_ID_CP_ME:
-		result = AMDGPU_UCODE_ID_CP_ME;
-		break;
-	case UCODE_ID_CP_MEC_JT1:
-	case UCODE_ID_CP_MEC_JT2:
-		result = AMDGPU_UCODE_ID_CP_MEC1;
-		break;
-	case UCODE_ID_RLC_G:
-		result = AMDGPU_UCODE_ID_RLC_G;
-		break;
-	default:
-		DRM_ERROR("UCode type is out of range!");
-	}
-
-	return result;
-}
-
-static uint8_t cz_smu_translate_firmware_enum_to_arg(
-			enum cz_scratch_entry firmware_enum)
-{
-	uint8_t ret = 0;
-
-	switch (firmware_enum) {
-	case CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0:
-		ret = UCODE_ID_SDMA0;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_SDMA1:
-		ret = UCODE_ID_SDMA1;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_CP_CE:
-		ret = UCODE_ID_CP_CE;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_CP_PFP:
-		ret = UCODE_ID_CP_PFP;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_CP_ME:
-		ret = UCODE_ID_CP_ME;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1:
-		ret = UCODE_ID_CP_MEC_JT1;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2:
-		ret = UCODE_ID_CP_MEC_JT2;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_GMCON_RENG:
-		ret = UCODE_ID_GMCON_RENG;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_RLC_G:
-		ret = UCODE_ID_RLC_G;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SCRATCH:
-		ret = UCODE_ID_RLC_SCRATCH;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_ARAM:
-		ret = UCODE_ID_RLC_SRM_ARAM;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_DRAM:
-		ret = UCODE_ID_RLC_SRM_DRAM;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_DMCU_ERAM:
-		ret = UCODE_ID_DMCU_ERAM;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_DMCU_IRAM:
-		ret = UCODE_ID_DMCU_IRAM;
-		break;
-	case CZ_SCRATCH_ENTRY_UCODE_ID_POWER_PROFILING:
-		ret = TASK_ARG_INIT_MM_PWR_LOG;
-		break;
-	case CZ_SCRATCH_ENTRY_DATA_ID_SDMA_HALT:
-	case CZ_SCRATCH_ENTRY_DATA_ID_SYS_CLOCKGATING:
-	case CZ_SCRATCH_ENTRY_DATA_ID_SDMA_RING_REGS:
-	case CZ_SCRATCH_ENTRY_DATA_ID_NONGFX_REINIT:
-	case CZ_SCRATCH_ENTRY_DATA_ID_SDMA_START:
-	case CZ_SCRATCH_ENTRY_DATA_ID_IH_REGISTERS:
-		ret = TASK_ARG_REG_MMIO;
-		break;
-	case CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE:
-		ret = TASK_ARG_INIT_CLK_TABLE;
-		break;
-	}
-
-	return ret;
-}
-
-static int cz_smu_populate_single_firmware_entry(struct amdgpu_device *adev,
-					enum cz_scratch_entry firmware_enum,
-					struct cz_buffer_entry *entry)
-{
-	uint64_t gpu_addr;
-	uint32_t data_size;
-	uint8_t ucode_id = cz_smu_translate_firmware_enum_to_arg(firmware_enum);
-	enum AMDGPU_UCODE_ID id = cz_convert_fw_type(ucode_id);
-	struct amdgpu_firmware_info *ucode = &adev->firmware.ucode[id];
-	const struct gfx_firmware_header_v1_0 *header;
-
-	if (ucode->fw == NULL)
-		return -EINVAL;
-
-	gpu_addr  = ucode->mc_addr;
-	header = (const struct gfx_firmware_header_v1_0 *)ucode->fw->data;
-	data_size = le32_to_cpu(header->header.ucode_size_bytes);
-
-	if ((firmware_enum == CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1) ||
-	    (firmware_enum == CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2)) {
-		gpu_addr += le32_to_cpu(header->jt_offset) << 2;
-		data_size = le32_to_cpu(header->jt_size) << 2;
-	}
-
-	entry->mc_addr_low = lower_32_bits(gpu_addr);
-	entry->mc_addr_high = upper_32_bits(gpu_addr);
-	entry->data_size = data_size;
-	entry->firmware_ID = firmware_enum;
-
-	return 0;
-}
-
-static int cz_smu_populate_single_scratch_entry(struct amdgpu_device *adev,
-					enum cz_scratch_entry scratch_type,
-					uint32_t size_in_byte,
-					struct cz_buffer_entry *entry)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	uint64_t mc_addr = (((uint64_t) priv->smu_buffer.mc_addr_high) << 32) |
-						priv->smu_buffer.mc_addr_low;
-	mc_addr += size_in_byte;
-
-	priv->smu_buffer_used_bytes += size_in_byte;
-	entry->data_size = size_in_byte;
-	entry->kaddr = priv->smu_buffer.kaddr + priv->smu_buffer_used_bytes;
-	entry->mc_addr_low = lower_32_bits(mc_addr);
-	entry->mc_addr_high = upper_32_bits(mc_addr);
-	entry->firmware_ID = scratch_type;
-
-	return 0;
-}
-
-static int cz_smu_populate_single_ucode_load_task(struct amdgpu_device *adev,
-						enum cz_scratch_entry firmware_enum,
-						bool is_last)
-{
-	uint8_t i;
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	struct TOC *toc = (struct TOC *)priv->toc_buffer.kaddr;
-	struct SMU_Task *task = &toc->tasks[priv->toc_entry_used_count++];
-
-	task->type = TASK_TYPE_UCODE_LOAD;
-	task->arg = cz_smu_translate_firmware_enum_to_arg(firmware_enum);
-	task->next = is_last ? END_OF_TASK_LIST : priv->toc_entry_used_count;
-
-	for (i = 0; i < priv->driver_buffer_length; i++)
-		if (priv->driver_buffer[i].firmware_ID == firmware_enum)
-			break;
-
-	if (i >= priv->driver_buffer_length) {
-		dev_err(adev->dev, "Invalid Firmware Type\n");
-		return -EINVAL;
-	}
-
-	task->addr.low = priv->driver_buffer[i].mc_addr_low;
-	task->addr.high = priv->driver_buffer[i].mc_addr_high;
-	task->size_bytes = priv->driver_buffer[i].data_size;
-
-	return 0;
-}
-
-static int cz_smu_populate_single_scratch_task(struct amdgpu_device *adev,
-						enum cz_scratch_entry firmware_enum,
-						uint8_t type, bool is_last)
-{
-	uint8_t i;
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	struct TOC *toc = (struct TOC *)priv->toc_buffer.kaddr;
-	struct SMU_Task *task = &toc->tasks[priv->toc_entry_used_count++];
-
-	task->type = type;
-	task->arg = cz_smu_translate_firmware_enum_to_arg(firmware_enum);
-	task->next = is_last ? END_OF_TASK_LIST : priv->toc_entry_used_count;
-
-	for (i = 0; i < priv->scratch_buffer_length; i++)
-		if (priv->scratch_buffer[i].firmware_ID == firmware_enum)
-			break;
-
-	if (i >= priv->scratch_buffer_length) {
-		dev_err(adev->dev, "Invalid Firmware Type\n");
-		return -EINVAL;
-	}
-
-	task->addr.low = priv->scratch_buffer[i].mc_addr_low;
-	task->addr.high = priv->scratch_buffer[i].mc_addr_high;
-	task->size_bytes = priv->scratch_buffer[i].data_size;
-
-	if (CZ_SCRATCH_ENTRY_DATA_ID_IH_REGISTERS == firmware_enum) {
-		struct cz_ih_meta_data *pIHReg_restore =
-			(struct cz_ih_meta_data *)priv->scratch_buffer[i].kaddr;
-		pIHReg_restore->command =
-			METADATA_CMD_MODE0 | METADATA_PERFORM_ON_LOAD;
-	}
-
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_rlc_aram_save(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	priv->toc_entry_aram = priv->toc_entry_used_count;
-	cz_smu_populate_single_scratch_task(adev,
-			CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_ARAM,
-			TASK_TYPE_UCODE_SAVE, true);
-
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_vddgfx_enter(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	struct TOC *toc = (struct TOC *)priv->toc_buffer.kaddr;
-
-	toc->JobList[JOB_GFX_SAVE] = (uint8_t)priv->toc_entry_used_count;
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SCRATCH,
-				TASK_TYPE_UCODE_SAVE, false);
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_DRAM,
-				TASK_TYPE_UCODE_SAVE, true);
-
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_vddgfx_exit(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	struct TOC *toc = (struct TOC *)priv->toc_buffer.kaddr;
-
-	toc->JobList[JOB_GFX_RESTORE] = (uint8_t)priv->toc_entry_used_count;
-
-	/* populate ucode */
-	if (adev->firmware.smu_load) {
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_CE, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_PFP, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_ME, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1, false);
-		if (adev->asic_type == CHIP_STONEY) {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1, false);
-		} else {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2, false);
-		}
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_G, false);
-	}
-
-	/* populate scratch */
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SCRATCH,
-				TASK_TYPE_UCODE_LOAD, false);
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_ARAM,
-				TASK_TYPE_UCODE_LOAD, false);
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_DRAM,
-				TASK_TYPE_UCODE_LOAD, true);
-
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_power_profiling(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	priv->toc_entry_power_profiling_index = priv->toc_entry_used_count;
-
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_POWER_PROFILING,
-				TASK_TYPE_INITIALIZE, true);
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_bootup(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	priv->toc_entry_initialize_index = priv->toc_entry_used_count;
-
-	if (adev->firmware.smu_load) {
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0, false);
-		if (adev->asic_type == CHIP_STONEY) {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0, false);
-		} else {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_SDMA1, false);
-		}
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_CE, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_PFP, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_ME, false);
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1, false);
-		if (adev->asic_type == CHIP_STONEY) {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1, false);
-		} else {
-			cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2, false);
-		}
-		cz_smu_populate_single_ucode_load_task(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_G, true);
-	}
-
-	return 0;
-}
-
-static int cz_smu_construct_toc_for_clock_table(struct amdgpu_device *adev)
-{
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	priv->toc_entry_clock_table = priv->toc_entry_used_count;
-
-	cz_smu_populate_single_scratch_task(adev,
-				CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE,
-				TASK_TYPE_INITIALIZE, true);
-
-	return 0;
-}
-
-static int cz_smu_initialize_toc_empty_job_list(struct amdgpu_device *adev)
-{
-	int i;
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-	struct TOC *toc = (struct TOC *)priv->toc_buffer.kaddr;
-
-	for (i = 0; i < NUM_JOBLIST_ENTRIES; i++)
-		toc->JobList[i] = (uint8_t)IGNORE_JOB;
-
-	return 0;
-}
-
-/*
- * cz smu uninitialization
- */
-int cz_smu_fini(struct amdgpu_device *adev)
-{
-	amdgpu_bo_unref(&adev->smu.toc_buf);
-	amdgpu_bo_unref(&adev->smu.smu_buf);
-	kfree(adev->smu.priv);
-	adev->smu.priv = NULL;
-	if (adev->firmware.smu_load)
-		amdgpu_ucode_fini_bo(adev);
-
-	return 0;
-}
-
-int cz_smu_download_pptable(struct amdgpu_device *adev, void **table)
-{
-	uint8_t i;
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	for (i = 0; i < priv->scratch_buffer_length; i++)
-		if (priv->scratch_buffer[i].firmware_ID ==
-				CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE)
-			break;
-
-	if (i >= priv->scratch_buffer_length) {
-		dev_err(adev->dev, "Invalid Scratch Type\n");
-		return -EINVAL;
-	}
-
-	*table = (struct SMU8_Fusion_ClkTable *)priv->scratch_buffer[i].kaddr;
-
-	/* prepare buffer for pptable */
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_SetClkTableAddrHi,
-				priv->scratch_buffer[i].mc_addr_high);
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_SetClkTableAddrLo,
-				priv->scratch_buffer[i].mc_addr_low);
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_ExecuteJob,
-				priv->toc_entry_clock_table);
-
-	/* actual downloading */
-	cz_send_msg_to_smc(adev, PPSMC_MSG_ClkTableXferToDram);
-
-	return 0;
-}
-
-int cz_smu_upload_pptable(struct amdgpu_device *adev)
-{
-	uint8_t i;
-	struct cz_smu_private_data *priv = cz_smu_get_priv(adev);
-
-	for (i = 0; i < priv->scratch_buffer_length; i++)
-		if (priv->scratch_buffer[i].firmware_ID ==
-				CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE)
-			break;
-
-	if (i >= priv->scratch_buffer_length) {
-		dev_err(adev->dev, "Invalid Scratch Type\n");
-		return -EINVAL;
-	}
-
-	/* prepare SMU */
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_SetClkTableAddrHi,
-				priv->scratch_buffer[i].mc_addr_high);
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_SetClkTableAddrLo,
-				priv->scratch_buffer[i].mc_addr_low);
-	cz_send_msg_to_smc_with_parameter(adev,
-				PPSMC_MSG_ExecuteJob,
-				priv->toc_entry_clock_table);
-
-	/* actual uploading */
-	cz_send_msg_to_smc(adev, PPSMC_MSG_ClkTableXferToSmu);
-
-	return 0;
-}
-
-/*
- * cz smumgr functions initialization
- */
-static const struct amdgpu_smumgr_funcs cz_smumgr_funcs = {
-	.check_fw_load_finish = cz_smu_check_finished,
-	.request_smu_load_fw = NULL,
-	.request_smu_specific_fw = NULL,
-};
-
-/*
- * cz smu initialization
- */
-int cz_smu_init(struct amdgpu_device *adev)
-{
-	int ret = -EINVAL;
-	uint64_t mc_addr = 0;
-	struct amdgpu_bo **toc_buf = &adev->smu.toc_buf;
-	struct amdgpu_bo **smu_buf = &adev->smu.smu_buf;
-	void *toc_buf_ptr = NULL;
-	void *smu_buf_ptr = NULL;
-
-	struct cz_smu_private_data *priv =
-		kzalloc(sizeof(struct cz_smu_private_data), GFP_KERNEL);
-	if (priv == NULL)
-		return -ENOMEM;
-
-	/* allocate firmware buffers */
-	if (adev->firmware.smu_load)
-		amdgpu_ucode_init_bo(adev);
-
-	adev->smu.priv = priv;
-	adev->smu.fw_flags = 0;
-	priv->toc_buffer.data_size = 4096;
-
-	priv->smu_buffer.data_size =
-				ALIGN(UCODE_ID_RLC_SCRATCH_SIZE_BYTE, 32) +
-				ALIGN(UCODE_ID_RLC_SRM_ARAM_SIZE_BYTE, 32) +
-				ALIGN(UCODE_ID_RLC_SRM_DRAM_SIZE_BYTE, 32) +
-				ALIGN(sizeof(struct SMU8_MultimediaPowerLogData), 32) +
-				ALIGN(sizeof(struct SMU8_Fusion_ClkTable), 32);
-
-	/* prepare toc buffer and smu buffer:
-	* 1. create amdgpu_bo for toc buffer and smu buffer
-	* 2. pin mc address
-	* 3. map kernel virtual address
-	*/
-	ret = amdgpu_bo_create(adev, priv->toc_buffer.data_size, PAGE_SIZE,
-			       true, AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-			       toc_buf);
-
-	if (ret) {
-		dev_err(adev->dev, "(%d) SMC TOC buffer allocation failed\n", ret);
-		return ret;
-	}
-
-	ret = amdgpu_bo_create(adev, priv->smu_buffer.data_size, PAGE_SIZE,
-			       true, AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-			       smu_buf);
-
-	if (ret) {
-		dev_err(adev->dev, "(%d) SMC Internal buffer allocation failed\n", ret);
-		return ret;
-	}
-
-	/* toc buffer reserve/pin/map */
-	ret = amdgpu_bo_reserve(adev->smu.toc_buf, false);
-	if (ret) {
-		amdgpu_bo_unref(&adev->smu.toc_buf);
-		dev_err(adev->dev, "(%d) SMC TOC buffer reserve failed\n", ret);
-		return ret;
-	}
-
-	ret = amdgpu_bo_pin(adev->smu.toc_buf, AMDGPU_GEM_DOMAIN_GTT, &mc_addr);
-	if (ret) {
-		amdgpu_bo_unreserve(adev->smu.toc_buf);
-		amdgpu_bo_unref(&adev->smu.toc_buf);
-		dev_err(adev->dev, "(%d) SMC TOC buffer pin failed\n", ret);
-		return ret;
-	}
-
-	ret = amdgpu_bo_kmap(*toc_buf, &toc_buf_ptr);
-	if (ret)
-		goto smu_init_failed;
-
-	amdgpu_bo_unreserve(adev->smu.toc_buf);
-
-	priv->toc_buffer.mc_addr_low = lower_32_bits(mc_addr);
-	priv->toc_buffer.mc_addr_high = upper_32_bits(mc_addr);
-	priv->toc_buffer.kaddr = toc_buf_ptr;
-
-	/* smu buffer reserve/pin/map */
-	ret = amdgpu_bo_reserve(adev->smu.smu_buf, false);
-	if (ret) {
-		amdgpu_bo_unref(&adev->smu.smu_buf);
-		dev_err(adev->dev, "(%d) SMC Internal buffer reserve failed\n", ret);
-		return ret;
-	}
-
-	ret = amdgpu_bo_pin(adev->smu.smu_buf, AMDGPU_GEM_DOMAIN_GTT, &mc_addr);
-	if (ret) {
-		amdgpu_bo_unreserve(adev->smu.smu_buf);
-		amdgpu_bo_unref(&adev->smu.smu_buf);
-		dev_err(adev->dev, "(%d) SMC Internal buffer pin failed\n", ret);
-		return ret;
-	}
-
-	ret = amdgpu_bo_kmap(*smu_buf, &smu_buf_ptr);
-	if (ret)
-		goto smu_init_failed;
-
-	amdgpu_bo_unreserve(adev->smu.smu_buf);
-
-	priv->smu_buffer.mc_addr_low = lower_32_bits(mc_addr);
-	priv->smu_buffer.mc_addr_high = upper_32_bits(mc_addr);
-	priv->smu_buffer.kaddr = smu_buf_ptr;
-
-	if (adev->firmware.smu_load) {
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-
-		if (adev->asic_type == CHIP_STONEY) {
-			if (cz_smu_populate_single_firmware_entry(adev,
-					CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0,
-					&priv->driver_buffer[priv->driver_buffer_length++]))
-				goto smu_init_failed;
-		} else {
-			if (cz_smu_populate_single_firmware_entry(adev,
-					CZ_SCRATCH_ENTRY_UCODE_ID_SDMA1,
-					&priv->driver_buffer[priv->driver_buffer_length++]))
-				goto smu_init_failed;
-		}
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_CE,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_PFP,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_ME,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-		if (adev->asic_type == CHIP_STONEY) {
-			if (cz_smu_populate_single_firmware_entry(adev,
-					CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1,
-					&priv->driver_buffer[priv->driver_buffer_length++]))
-				goto smu_init_failed;
-		} else {
-			if (cz_smu_populate_single_firmware_entry(adev,
-					CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2,
-					&priv->driver_buffer[priv->driver_buffer_length++]))
-				goto smu_init_failed;
-		}
-		if (cz_smu_populate_single_firmware_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_G,
-				&priv->driver_buffer[priv->driver_buffer_length++]))
-			goto smu_init_failed;
-	}
-
-	if (cz_smu_populate_single_scratch_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SCRATCH,
-				UCODE_ID_RLC_SCRATCH_SIZE_BYTE,
-				&priv->scratch_buffer[priv->scratch_buffer_length++]))
-		goto smu_init_failed;
-	if (cz_smu_populate_single_scratch_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_ARAM,
-				UCODE_ID_RLC_SRM_ARAM_SIZE_BYTE,
-				&priv->scratch_buffer[priv->scratch_buffer_length++]))
-		goto smu_init_failed;
-	if (cz_smu_populate_single_scratch_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_DRAM,
-				UCODE_ID_RLC_SRM_DRAM_SIZE_BYTE,
-				&priv->scratch_buffer[priv->scratch_buffer_length++]))
-		goto smu_init_failed;
-	if (cz_smu_populate_single_scratch_entry(adev,
-				CZ_SCRATCH_ENTRY_UCODE_ID_POWER_PROFILING,
-				sizeof(struct SMU8_MultimediaPowerLogData),
-				&priv->scratch_buffer[priv->scratch_buffer_length++]))
-		goto smu_init_failed;
-	if (cz_smu_populate_single_scratch_entry(adev,
-				CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE,
-				sizeof(struct SMU8_Fusion_ClkTable),
-				&priv->scratch_buffer[priv->scratch_buffer_length++]))
-		goto smu_init_failed;
-
-	cz_smu_initialize_toc_empty_job_list(adev);
-	cz_smu_construct_toc_for_rlc_aram_save(adev);
-	cz_smu_construct_toc_for_vddgfx_enter(adev);
-	cz_smu_construct_toc_for_vddgfx_exit(adev);
-	cz_smu_construct_toc_for_power_profiling(adev);
-	cz_smu_construct_toc_for_bootup(adev);
-	cz_smu_construct_toc_for_clock_table(adev);
-	/* init the smumgr functions */
-	adev->smu.smumgr_funcs = &cz_smumgr_funcs;
-
-	return 0;
-
-smu_init_failed:
-	amdgpu_bo_unref(toc_buf);
-	amdgpu_bo_unref(smu_buf);
-
-	return ret;
-}
--- a/drivers/gpu/drm/amd/amdgpu/cz_smumgr.h
+++ b/drivers/gpu/drm/amd/amdgpu/cz_smumgr.h
@ -1,94 +0,0 @@
-/*
- * Copyright 2014 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-#ifndef __CZ_SMC_H__
-#define __CZ_SMC_H__
-
-#define MAX_NUM_FIRMWARE                        8
-#define MAX_NUM_SCRATCH                         11
-#define CZ_SCRATCH_SIZE_NONGFX_CLOCKGATING      1024
-#define CZ_SCRATCH_SIZE_NONGFX_GOLDENSETTING    2048
-#define CZ_SCRATCH_SIZE_SDMA_METADATA           1024
-#define CZ_SCRATCH_SIZE_IH                      ((2*256+1)*4)
-
-enum cz_scratch_entry {
-	CZ_SCRATCH_ENTRY_UCODE_ID_SDMA0 = 0,
-	CZ_SCRATCH_ENTRY_UCODE_ID_SDMA1,
-	CZ_SCRATCH_ENTRY_UCODE_ID_CP_CE,
-	CZ_SCRATCH_ENTRY_UCODE_ID_CP_PFP,
-	CZ_SCRATCH_ENTRY_UCODE_ID_CP_ME,
-	CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT1,
-	CZ_SCRATCH_ENTRY_UCODE_ID_CP_MEC_JT2,
-	CZ_SCRATCH_ENTRY_UCODE_ID_GMCON_RENG,
-	CZ_SCRATCH_ENTRY_UCODE_ID_RLC_G,
-	CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SCRATCH,
-	CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_ARAM,
-	CZ_SCRATCH_ENTRY_UCODE_ID_RLC_SRM_DRAM,
-	CZ_SCRATCH_ENTRY_UCODE_ID_DMCU_ERAM,
-	CZ_SCRATCH_ENTRY_UCODE_ID_DMCU_IRAM,
-	CZ_SCRATCH_ENTRY_UCODE_ID_POWER_PROFILING,
-	CZ_SCRATCH_ENTRY_DATA_ID_SDMA_HALT,
-	CZ_SCRATCH_ENTRY_DATA_ID_SYS_CLOCKGATING,
-	CZ_SCRATCH_ENTRY_DATA_ID_SDMA_RING_REGS,
-	CZ_SCRATCH_ENTRY_DATA_ID_NONGFX_REINIT,
-	CZ_SCRATCH_ENTRY_DATA_ID_SDMA_START,
-	CZ_SCRATCH_ENTRY_DATA_ID_IH_REGISTERS,
-	CZ_SCRATCH_ENTRY_SMU8_FUSION_CLKTABLE
-};
-
-struct cz_buffer_entry {
-	uint32_t	data_size;
-	uint32_t	mc_addr_low;
-	uint32_t	mc_addr_high;
-	void		*kaddr;
-	enum cz_scratch_entry firmware_ID;
-};
-
-struct cz_register_index_data_pair {
-	uint32_t	offset;
-	uint32_t	value;
-};
-
-struct cz_ih_meta_data {
-	uint32_t	command;
-	struct cz_register_index_data_pair register_index_value_pair[1];
-};
-
-struct cz_smu_private_data {
-	uint8_t		driver_buffer_length;
-	uint8_t		scratch_buffer_length;
-	uint16_t	toc_entry_used_count;
-	uint16_t	toc_entry_initialize_index;
-	uint16_t	toc_entry_power_profiling_index;
-	uint16_t	toc_entry_aram;
-	uint16_t	toc_entry_ih_register_restore_task_index;
-	uint16_t	toc_entry_clock_table;
-	uint16_t	ih_register_restore_task_size;
-	uint16_t	smu_buffer_used_bytes;
-
-	struct cz_buffer_entry  toc_buffer;
-	struct cz_buffer_entry  smu_buffer;
-	struct cz_buffer_entry  driver_buffer[MAX_NUM_FIRMWARE];
-	struct cz_buffer_entry  scratch_buffer[MAX_NUM_SCRATCH];
-};
-
-#endif
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@ -2072,7 +2072,7 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc *crtc,

 	pipe_config = AMDGPU_TILING_GET(tiling_flags, PIPE_CONFIG);

-	switch (target_fb->pixel_format) {
+	switch (target_fb->format->format) {
 	case DRM_FORMAT_C8:
 		fb_format = REG_SET_FIELD(0, GRPH_CONTROL, GRPH_DEPTH, 0);
 		fb_format = REG_SET_FIELD(fb_format, GRPH_CONTROL, GRPH_FORMAT, 0);
@ -2145,7 +2145,7 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc *crtc,
 		break;
 	default:
 		DRM_ERROR("Unsupported screen format %s\n",
-		          drm_get_format_name(target_fb->pixel_format, &format_name));
+		          drm_get_format_name(target_fb->format->format, &format_name));
 		return -EINVAL;
 	}

@ -2220,7 +2220,7 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(mmGRPH_X_END + amdgpu_crtc->crtc_offset, target_fb->width);
 	WREG32(mmGRPH_Y_END + amdgpu_crtc->crtc_offset, target_fb->height);

-	fb_pitch_pixels = target_fb->pitches[0] / (target_fb->bits_per_pixel / 8);
+	fb_pitch_pixels = target_fb->pitches[0] / target_fb->format->cpp[0];
 	WREG32(mmGRPH_PITCH + amdgpu_crtc->crtc_offset, fb_pitch_pixels);

 	dce_v10_0_grph_enable(crtc, true);
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@ -2056,7 +2056,7 @@ static int dce_v11_0_crtc_do_set_base(struct drm_crtc *crtc,

 	pipe_config = AMDGPU_TILING_GET(tiling_flags, PIPE_CONFIG);

-	switch (target_fb->pixel_format) {
+	switch (target_fb->format->format) {
 	case DRM_FORMAT_C8:
 		fb_format = REG_SET_FIELD(0, GRPH_CONTROL, GRPH_DEPTH, 0);
 		fb_format = REG_SET_FIELD(fb_format, GRPH_CONTROL, GRPH_FORMAT, 0);
@ -2129,7 +2129,7 @@ static int dce_v11_0_crtc_do_set_base(struct drm_crtc *crtc,
 		break;
 	default:
 		DRM_ERROR("Unsupported screen format %s\n",
-		          drm_get_format_name(target_fb->pixel_format, &format_name));
+		          drm_get_format_name(target_fb->format->format, &format_name));
 		return -EINVAL;
 	}

@ -2204,7 +2204,7 @@ static int dce_v11_0_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(mmGRPH_X_END + amdgpu_crtc->crtc_offset, target_fb->width);
 	WREG32(mmGRPH_Y_END + amdgpu_crtc->crtc_offset, target_fb->height);

-	fb_pitch_pixels = target_fb->pitches[0] / (target_fb->bits_per_pixel / 8);
+	fb_pitch_pixels = target_fb->pitches[0] / target_fb->format->cpp[0];
 	WREG32(mmGRPH_PITCH + amdgpu_crtc->crtc_offset, fb_pitch_pixels);

 	dce_v11_0_grph_enable(crtc, true);
@ -3737,9 +3737,15 @@ static void dce_v11_0_encoder_add(struct amdgpu_device *adev,
 	default:
 		encoder->possible_crtcs = 0x3;
 		break;
+	case 3:
+		encoder->possible_crtcs = 0x7;
+		break;
 	case 4:
 		encoder->possible_crtcs = 0xf;
 		break;
+	case 5:
+		encoder->possible_crtcs = 0x1f;
+		break;
 	case 6:
 		encoder->possible_crtcs = 0x3f;
 		break;
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@ -1501,7 +1501,7 @@ static int dce_v6_0_crtc_do_set_base(struct drm_crtc *crtc,
 	amdgpu_bo_get_tiling_flags(abo, &tiling_flags);
 	amdgpu_bo_unreserve(abo);

-	switch (target_fb->pixel_format) {
+	switch (target_fb->format->format) {
 	case DRM_FORMAT_C8:
 		fb_format = (GRPH_DEPTH(GRPH_DEPTH_8BPP) |
 			     GRPH_FORMAT(GRPH_FORMAT_INDEXED));
@ -1567,7 +1567,7 @@ static int dce_v6_0_crtc_do_set_base(struct drm_crtc *crtc,
 		break;
 	default:
 		DRM_ERROR("Unsupported screen format %s\n",
-		          drm_get_format_name(target_fb->pixel_format, &format_name));
+		          drm_get_format_name(target_fb->format->format, &format_name));
 		return -EINVAL;
 	}

@ -1630,7 +1630,7 @@ static int dce_v6_0_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(mmGRPH_X_END + amdgpu_crtc->crtc_offset, target_fb->width);
 	WREG32(mmGRPH_Y_END + amdgpu_crtc->crtc_offset, target_fb->height);

-	fb_pitch_pixels = target_fb->pitches[0] / (target_fb->bits_per_pixel / 8);
+	fb_pitch_pixels = target_fb->pitches[0] / target_fb->format->cpp[0];
 	WREG32(mmGRPH_PITCH + amdgpu_crtc->crtc_offset, fb_pitch_pixels);

 	dce_v6_0_grph_enable(crtc, true);
--- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
@ -1950,7 +1950,7 @@ static int dce_v8_0_crtc_do_set_base(struct drm_crtc *crtc,

 	pipe_config = AMDGPU_TILING_GET(tiling_flags, PIPE_CONFIG);

-	switch (target_fb->pixel_format) {
+	switch (target_fb->format->format) {
 	case DRM_FORMAT_C8:
 		fb_format = ((GRPH_DEPTH_8BPP << GRPH_CONTROL__GRPH_DEPTH__SHIFT) |
 			     (GRPH_FORMAT_INDEXED << GRPH_CONTROL__GRPH_FORMAT__SHIFT));
@ -2016,7 +2016,7 @@ static int dce_v8_0_crtc_do_set_base(struct drm_crtc *crtc,
 		break;
 	default:
 		DRM_ERROR("Unsupported screen format %s\n",
-		          drm_get_format_name(target_fb->pixel_format, &format_name));
+		          drm_get_format_name(target_fb->format->format, &format_name));
 		return -EINVAL;
 	}

@ -2079,7 +2079,7 @@ static int dce_v8_0_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(mmGRPH_X_END + amdgpu_crtc->crtc_offset, target_fb->width);
 	WREG32(mmGRPH_Y_END + amdgpu_crtc->crtc_offset, target_fb->height);

-	fb_pitch_pixels = target_fb->pitches[0] / (target_fb->bits_per_pixel / 8);
+	fb_pitch_pixels = target_fb->pitches[0] / target_fb->format->cpp[0];
 	WREG32(mmGRPH_PITCH + amdgpu_crtc->crtc_offset, fb_pitch_pixels);

 	dce_v8_0_grph_enable(crtc, true);
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@ -25,7 +25,7 @@
 #include "amdgpu_ih.h"
 #include "amdgpu_gfx.h"
 #include "amdgpu_ucode.h"
-#include "si/clearstate_si.h"
+#include "clearstate_si.h"
 #include "bif/bif_3_0_d.h"
 #include "bif/bif_3_0_sh_mask.h"
 #include "oss/oss_1_0_d.h"
@ -1325,21 +1325,19 @@ static u32 gfx_v6_0_create_bitmask(u32 bit_width)
 	return (u32)(((u64)1 << bit_width) - 1);
 }

-static u32 gfx_v6_0_get_rb_disabled(struct amdgpu_device *adev,
-				    u32 max_rb_num_per_se,
-				    u32 sh_per_se)
+static u32 gfx_v6_0_get_rb_active_bitmap(struct amdgpu_device *adev)
 {
 	u32 data, mask;

-	data = RREG32(mmCC_RB_BACKEND_DISABLE);
-	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
-	data |= RREG32(mmGC_USER_RB_BACKEND_DISABLE);
+	data = RREG32(mmCC_RB_BACKEND_DISABLE) |
+		RREG32(mmGC_USER_RB_BACKEND_DISABLE);

-	data >>= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;
+	data = REG_GET_FIELD(data, GC_USER_RB_BACKEND_DISABLE, BACKEND_DISABLE);

-	mask = gfx_v6_0_create_bitmask(max_rb_num_per_se / sh_per_se);
+	mask = gfx_v6_0_create_bitmask(adev->gfx.config.max_backends_per_se/
+					adev->gfx.config.max_sh_per_se);

-	return data & mask;
+	return ~data & mask;
 }

 static void gfx_v6_0_raster_config(struct amdgpu_device *adev, u32 *rconf)
@ -1468,68 +1466,55 @@ static void gfx_v6_0_write_harvested_raster_configs(struct amdgpu_device *adev,
 	gfx_v6_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
 }

-static void gfx_v6_0_setup_rb(struct amdgpu_device *adev,
-			      u32 se_num, u32 sh_per_se,
-			      u32 max_rb_num_per_se)
+static void gfx_v6_0_setup_rb(struct amdgpu_device *adev)
 {
 	int i, j;
-	u32 data, mask;
-	u32 disabled_rbs = 0;
-	u32 enabled_rbs = 0;
+	u32 data;
+	u32 raster_config = 0;
+	u32 active_rbs = 0;
+	u32 rb_bitmap_width_per_sh = adev->gfx.config.max_backends_per_se /
+					adev->gfx.config.max_sh_per_se;
 	unsigned num_rb_pipes;

 	mutex_lock(&adev->grbm_idx_mutex);
-	for (i = 0; i < se_num; i++) {
-		for (j = 0; j < sh_per_se; j++) {
+	for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
+		for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
 			gfx_v6_0_select_se_sh(adev, i, j, 0xffffffff);
-			data = gfx_v6_0_get_rb_disabled(adev, max_rb_num_per_se, sh_per_se);
-			disabled_rbs |= data << ((i * sh_per_se + j) * 2);
+			data = gfx_v6_0_get_rb_active_bitmap(adev);
+			active_rbs |= data << ((i * adev->gfx.config.max_sh_per_se + j) *
+					rb_bitmap_width_per_sh);
 		}
 	}
 	gfx_v6_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
-	mutex_unlock(&adev->grbm_idx_mutex);

-	mask = 1;
-	for (i = 0; i < max_rb_num_per_se * se_num; i++) {
-		if (!(disabled_rbs & mask))
-			enabled_rbs |= mask;
-		mask <<= 1;
-	}
-
-	adev->gfx.config.backend_enable_mask = enabled_rbs;
-	adev->gfx.config.num_rbs = hweight32(enabled_rbs);
+	adev->gfx.config.backend_enable_mask = active_rbs;
+	adev->gfx.config.num_rbs = hweight32(active_rbs);

 	num_rb_pipes = min_t(unsigned, adev->gfx.config.max_backends_per_se *
 			     adev->gfx.config.max_shader_engines, 16);

-	mutex_lock(&adev->grbm_idx_mutex);
-	for (i = 0; i < se_num; i++) {
-		gfx_v6_0_select_se_sh(adev, i, 0xffffffff, 0xffffffff);
-		data = 0;
-		for (j = 0; j < sh_per_se; j++) {
-			switch (enabled_rbs & 3) {
-			case 1:
-				data |= (RASTER_CONFIG_RB_MAP_0 << (i * sh_per_se + j) * 2);
-				break;
-			case 2:
-				data |= (RASTER_CONFIG_RB_MAP_3 << (i * sh_per_se + j) * 2);
-				break;
-			case 3:
-			default:
-				data |= (RASTER_CONFIG_RB_MAP_2 << (i * sh_per_se + j) * 2);
-				break;
-			}
-			enabled_rbs >>= 2;
-		}
-		gfx_v6_0_raster_config(adev, &data);
+	gfx_v6_0_raster_config(adev, &raster_config);

-		if (!adev->gfx.config.backend_enable_mask ||
-				adev->gfx.config.num_rbs >= num_rb_pipes)
-			WREG32(mmPA_SC_RASTER_CONFIG, data);
-		else
-			gfx_v6_0_write_harvested_raster_configs(adev, data,
-								adev->gfx.config.backend_enable_mask,
-								num_rb_pipes);
+	if (!adev->gfx.config.backend_enable_mask ||
+			adev->gfx.config.num_rbs >= num_rb_pipes) {
+		WREG32(mmPA_SC_RASTER_CONFIG, raster_config);
+	} else {
+		gfx_v6_0_write_harvested_raster_configs(adev, raster_config,
+							adev->gfx.config.backend_enable_mask,
+							num_rb_pipes);
+	}
+
+	/* cache the values for userspace */
+	for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
+		for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
+			gfx_v6_0_select_se_sh(adev, i, j, 0xffffffff);
+			adev->gfx.config.rb_config[i][j].rb_backend_disable =
+				RREG32(mmCC_RB_BACKEND_DISABLE);
+			adev->gfx.config.rb_config[i][j].user_rb_backend_disable =
+				RREG32(mmGC_USER_RB_BACKEND_DISABLE);
+			adev->gfx.config.rb_config[i][j].raster_config =
+				RREG32(mmPA_SC_RASTER_CONFIG);
+		}
 	}
 	gfx_v6_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
 	mutex_unlock(&adev->grbm_idx_mutex);
@ -1540,36 +1525,44 @@ static void gmc_v6_0_init_compute_vmid(struct amdgpu_device *adev)
 }
 */

-static u32 gfx_v6_0_get_cu_enabled(struct amdgpu_device *adev, u32 cu_per_sh)
+static void gfx_v6_0_set_user_cu_inactive_bitmap(struct amdgpu_device *adev,
+						 u32 bitmap)
+{
+	u32 data;
+
+	if (!bitmap)
+		return;
+
+	data = bitmap << GC_USER_SHADER_ARRAY_CONFIG__INACTIVE_CUS__SHIFT;
+	data &= GC_USER_SHADER_ARRAY_CONFIG__INACTIVE_CUS_MASK;
+
+	WREG32(mmGC_USER_SHADER_ARRAY_CONFIG, data);
+}
+
+static u32 gfx_v6_0_get_cu_enabled(struct amdgpu_device *adev)
 {
 	u32 data, mask;

-	data = RREG32(mmCC_GC_SHADER_ARRAY_CONFIG);
-	data &= CC_GC_SHADER_ARRAY_CONFIG__INACTIVE_CUS_MASK;
-	data |= RREG32(mmGC_USER_SHADER_ARRAY_CONFIG);
+	data = RREG32(mmCC_GC_SHADER_ARRAY_CONFIG) |
+		RREG32(mmGC_USER_SHADER_ARRAY_CONFIG);

-	data >>= CC_GC_SHADER_ARRAY_CONFIG__INACTIVE_CUS__SHIFT;
-
-	mask = gfx_v6_0_create_bitmask(cu_per_sh);
-
-	return ~data & mask;
+	mask = gfx_v6_0_create_bitmask(adev->gfx.config.max_cu_per_sh);
+	return ~REG_GET_FIELD(data, CC_GC_SHADER_ARRAY_CONFIG, INACTIVE_CUS) & mask;
 }


-static void gfx_v6_0_setup_spi(struct amdgpu_device *adev,
-			 u32 se_num, u32 sh_per_se,
-			 u32 cu_per_sh)
+static void gfx_v6_0_setup_spi(struct amdgpu_device *adev)
 {
 	int i, j, k;
 	u32 data, mask;
 	u32 active_cu = 0;

 	mutex_lock(&adev->grbm_idx_mutex);
-	for (i = 0; i < se_num; i++) {
-		for (j = 0; j < sh_per_se; j++) {
+	for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
+		for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
 			gfx_v6_0_select_se_sh(adev, i, j, 0xffffffff);
 			data = RREG32(mmSPI_STATIC_THREAD_MGMT_3);
-			active_cu = gfx_v6_0_get_cu_enabled(adev, cu_per_sh);
+			active_cu = gfx_v6_0_get_cu_enabled(adev);

 			mask = 1;
 			for (k = 0; k < 16; k++) {
@ -1717,6 +1710,9 @@ static void gfx_v6_0_gpu_init(struct amdgpu_device *adev)
 		gb_addr_config |= 2 << GB_ADDR_CONFIG__ROW_SIZE__SHIFT;
 		break;
 	}
+	gb_addr_config &= ~GB_ADDR_CONFIG__NUM_SHADER_ENGINES_MASK;
+	if (adev->gfx.config.max_shader_engines == 2)
+		gb_addr_config |= 1 << GB_ADDR_CONFIG__NUM_SHADER_ENGINES__SHIFT;
 	adev->gfx.config.gb_addr_config = gb_addr_config;

 	WREG32(mmGB_ADDR_CONFIG, gb_addr_config);
@ -1735,13 +1731,9 @@ static void gfx_v6_0_gpu_init(struct amdgpu_device *adev)
 #endif
 	gfx_v6_0_tiling_mode_table_init(adev);

-	gfx_v6_0_setup_rb(adev, adev->gfx.config.max_shader_engines,
-		    adev->gfx.config.max_sh_per_se,
-		    adev->gfx.config.max_backends_per_se);
+	gfx_v6_0_setup_rb(adev);

-	gfx_v6_0_setup_spi(adev, adev->gfx.config.max_shader_engines,
-		     adev->gfx.config.max_sh_per_se,
-		     adev->gfx.config.max_cu_per_sh);
+	gfx_v6_0_setup_spi(adev);

 	gfx_v6_0_get_cu_info(adev);

@ -1794,14 +1786,9 @@ static void gfx_v6_0_gpu_init(struct amdgpu_device *adev)

 static void gfx_v6_0_scratch_init(struct amdgpu_device *adev)
 {
-	int i;
-
 	adev->gfx.scratch.num_reg = 7;
 	adev->gfx.scratch.reg_base = mmSCRATCH_REG0;
-	for (i = 0; i < adev->gfx.scratch.num_reg; i++) {
-		adev->gfx.scratch.free[i] = true;
-		adev->gfx.scratch.reg[i] = adev->gfx.scratch.reg_base + i;
-	}
+	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }

 static int gfx_v6_0_ring_test_ring(struct amdgpu_ring *ring)
@ -1975,7 +1962,7 @@ static int gfx_v6_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[2] = 0xDEADBEEF;
 	ib.length_dw = 3;

-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err2;

@ -2946,61 +2933,16 @@ static void gfx_v6_0_enable_gfx_cgpg(struct amdgpu_device *adev,
 	}
 }

-static u32 gfx_v6_0_get_cu_active_bitmap(struct amdgpu_device *adev,
-					 u32 se, u32 sh)
-{
-
-	u32 mask = 0, tmp, tmp1;
-	int i;
-
-	mutex_lock(&adev->grbm_idx_mutex);
-	gfx_v6_0_select_se_sh(adev, se, sh, 0xffffffff);
-	tmp = RREG32(mmCC_GC_SHADER_ARRAY_CONFIG);
-	tmp1 = RREG32(mmGC_USER_SHADER_ARRAY_CONFIG);
-	gfx_v6_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
-	mutex_unlock(&adev->grbm_idx_mutex);
-
-	tmp &= 0xffff0000;
-
-	tmp |= tmp1;
-	tmp >>= 16;
-
-	for (i = 0; i < adev->gfx.config.max_cu_per_sh; i ++) {
-		mask <<= 1;
-		mask |= 1;
-	}
-
-	return (~tmp) & mask;
-}
-
 static void gfx_v6_0_init_ao_cu_mask(struct amdgpu_device *adev)
 {
-	u32 i, j, k, active_cu_number = 0;
+	u32 tmp;

-	u32 mask, counter, cu_bitmap;
-	u32 tmp = 0;
+	WREG32(mmRLC_PG_ALWAYS_ON_CU_MASK, adev->gfx.cu_info.ao_cu_mask);

-	for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
-		for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
-			mask = 1;
-			cu_bitmap = 0;
-			counter  = 0;
-			for (k = 0; k < adev->gfx.config.max_cu_per_sh; k++) {
-				if (gfx_v6_0_get_cu_active_bitmap(adev, i, j) & mask) {
-					if (counter < 2)
-						cu_bitmap |= mask;
-					counter++;
-				}
-				mask <<= 1;
-			}
-
-			active_cu_number += counter;
-			tmp |= (cu_bitmap << (i * 16 + j * 8));
-		}
-	}
-
-	WREG32(mmRLC_PG_AO_CU_MASK, tmp);
-	WREG32_FIELD(RLC_MAX_PG_CU, MAX_POWERED_UP_CU, active_cu_number);
+	tmp = RREG32(mmRLC_MAX_PG_CU);
+	tmp &= ~RLC_MAX_PG_CU__MAX_POWERED_UP_CU_MASK;
+	tmp |= (adev->gfx.cu_info.number << RLC_MAX_PG_CU__MAX_POWERED_UP_CU__SHIFT);
+	WREG32(mmRLC_MAX_PG_CU, tmp);
 }

 static void gfx_v6_0_enable_gfx_static_mgpg(struct amdgpu_device *adev,
@ -3775,18 +3717,26 @@ static void gfx_v6_0_get_cu_info(struct amdgpu_device *adev)
 	int i, j, k, counter, active_cu_number = 0;
 	u32 mask, bitmap, ao_bitmap, ao_cu_mask = 0;
 	struct amdgpu_cu_info *cu_info = &adev->gfx.cu_info;
+	unsigned disable_masks[4 * 2];

 	memset(cu_info, 0, sizeof(*cu_info));

+	amdgpu_gfx_parse_disable_cu(disable_masks, 4, 2);
+
+	mutex_lock(&adev->grbm_idx_mutex);
 	for (i = 0; i < adev->gfx.config.max_shader_engines; i++) {
 		for (j = 0; j < adev->gfx.config.max_sh_per_se; j++) {
 			mask = 1;
 			ao_bitmap = 0;
 			counter = 0;
-			bitmap = gfx_v6_0_get_cu_active_bitmap(adev, i, j);
+			gfx_v6_0_select_se_sh(adev, i, j, 0xffffffff);
+			if (i < 4 && j < 2)
+				gfx_v6_0_set_user_cu_inactive_bitmap(
+					adev, disable_masks[i * 2 + j]);
+			bitmap = gfx_v6_0_get_cu_enabled(adev);
 			cu_info->bitmap[i][j] = bitmap;

-			for (k = 0; k < adev->gfx.config.max_cu_per_sh; k ++) {
+			for (k = 0; k < 16; k++) {
 				if (bitmap & mask) {
 					if (counter < 2)
 						ao_bitmap |= mask;
@ -3799,6 +3749,9 @@ static void gfx_v6_0_get_cu_info(struct amdgpu_device *adev)
 		}
 	}

+	gfx_v6_0_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);
+	mutex_unlock(&adev->grbm_idx_mutex);
+
 	cu_info->number = active_cu_number;
 	cu_info->ao_cu_mask = ao_cu_mask;
 }
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@ -1983,6 +1983,14 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
 	WREG32(mmPA_CL_ENHANCE, PA_CL_ENHANCE__CLIP_VTX_REORDER_ENA_MASK |
 			(3 << PA_CL_ENHANCE__NUM_CLIP_SEQ__SHIFT));
 	WREG32(mmPA_SC_ENHANCE, PA_SC_ENHANCE__ENABLE_PA_SC_OUT_OF_ORDER_MASK);
+
+	tmp = RREG32(mmSPI_ARB_PRIORITY);
+	tmp = REG_SET_FIELD(tmp, SPI_ARB_PRIORITY, PIPE_ORDER_TS0, 2);
+	tmp = REG_SET_FIELD(tmp, SPI_ARB_PRIORITY, PIPE_ORDER_TS1, 2);
+	tmp = REG_SET_FIELD(tmp, SPI_ARB_PRIORITY, PIPE_ORDER_TS2, 2);
+	tmp = REG_SET_FIELD(tmp, SPI_ARB_PRIORITY, PIPE_ORDER_TS3, 2);
+	WREG32(mmSPI_ARB_PRIORITY, tmp);
+
 	mutex_unlock(&adev->grbm_idx_mutex);

 	udelay(50);
@ -2003,14 +2011,9 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
 */
 static void gfx_v7_0_scratch_init(struct amdgpu_device *adev)
 {
-	int i;
-
 	adev->gfx.scratch.num_reg = 7;
 	adev->gfx.scratch.reg_base = mmSCRATCH_REG0;
-	for (i = 0; i < adev->gfx.scratch.num_reg; i++) {
-		adev->gfx.scratch.free[i] = true;
-		adev->gfx.scratch.reg[i] = adev->gfx.scratch.reg_base + i;
-	}
+	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }

 /**
@ -2321,7 +2324,7 @@ static int gfx_v7_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[2] = 0xDEADBEEF;
 	ib.length_dw = 3;

-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err2;

--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@ -375,9 +375,16 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
-	adev->mc.visible_vram_size = adev->mc.aper_size;
+
+#ifdef CONFIG_X86_64
+	if (adev->flags & AMD_IS_APU) {
+		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
+		adev->mc.aper_size = adev->mc.real_vram_size;
+	}
+#endif

 	/* In case the PCI BAR is larger than the actual amount of vram */
+	adev->mc.visible_vram_size = adev->mc.aper_size;
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;

--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@ -467,9 +467,16 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
 	/* size in MB on si */
 	adev->mc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
-	adev->mc.visible_vram_size = adev->mc.aper_size;
+
+#ifdef CONFIG_X86_64
+	if (adev->flags & AMD_IS_APU) {
+		adev->mc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22;
+		adev->mc.aper_size = adev->mc.real_vram_size;
+	}
+#endif

 	/* In case the PCI BAR is larger than the actual amount of vram */
+	adev->mc.visible_vram_size = adev->mc.aper_size;
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;

@ -1439,6 +1446,21 @@ static int gmc_v8_0_set_powergating_state(void *handle,
 	return 0;
 }

+static void gmc_v8_0_get_clockgating_state(void *handle, u32 *flags)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	int data;
+
+	/* AMD_CG_SUPPORT_MC_MGCG */
+	data = RREG32(mmMC_HUB_MISC_HUB_CG);
+	if (data & MC_HUB_MISC_HUB_CG__ENABLE_MASK)
+		*flags |= AMD_CG_SUPPORT_MC_MGCG;
+
+	/* AMD_CG_SUPPORT_MC_LS */
+	if (data & MC_HUB_MISC_HUB_CG__MEM_LS_ENABLE_MASK)
+		*flags |= AMD_CG_SUPPORT_MC_LS;
+}
+
 static const struct amd_ip_funcs gmc_v8_0_ip_funcs = {
 	.name = "gmc_v8_0",
 	.early_init = gmc_v8_0_early_init,
@ -1457,6 +1479,7 @@ static const struct amd_ip_funcs gmc_v8_0_ip_funcs = {
 	.post_soft_reset = gmc_v8_0_post_soft_reset,
 	.set_clockgating_state = gmc_v8_0_set_clockgating_state,
 	.set_powergating_state = gmc_v8_0_set_powergating_state,
+	.get_clockgating_state = gmc_v8_0_get_clockgating_state,
 };

 static const struct amdgpu_gart_funcs gmc_v8_0_gart_funcs = {
--- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
@ -1230,6 +1230,7 @@ static void kv_update_current_ps(struct amdgpu_device *adev,
 	pi->current_rps = *rps;
 	pi->current_ps = *new_ps;
 	pi->current_rps.ps_priv = &pi->current_ps;
+	adev->pm.dpm.current_ps = &pi->current_rps;
 }

 static void kv_update_requested_ps(struct amdgpu_device *adev,
@ -1241,6 +1242,7 @@ static void kv_update_requested_ps(struct amdgpu_device *adev,
 	pi->requested_rps = *rps;
 	pi->requested_ps = *new_ps;
 	pi->requested_rps.ps_priv = &pi->requested_ps;
+	adev->pm.dpm.requested_ps = &pi->requested_rps;
 }

 static void kv_dpm_enable_bapm(struct amdgpu_device *adev, bool enable)
@ -1548,11 +1550,6 @@ static int kv_update_vce_dpm(struct amdgpu_device *adev,

 	if (amdgpu_new_state->evclk > 0 && amdgpu_current_state->evclk == 0) {
 		kv_dpm_powergate_vce(adev, false);
-		/* turn the clocks on when encoding */
-		ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-						    AMD_CG_STATE_UNGATE);
-		if (ret)
-			return ret;
 		if (pi->caps_stable_p_state)
 			pi->vce_boot_level = table->count - 1;
 		else
@ -1571,15 +1568,9 @@ static int kv_update_vce_dpm(struct amdgpu_device *adev,
 			amdgpu_kv_send_msg_to_smc_with_parameter(adev,
 							  PPSMC_MSG_VCEDPM_SetEnabledMask,
 							  (1 << pi->vce_boot_level));
-
 		kv_enable_vce_dpm(adev, true);
 	} else if (amdgpu_new_state->evclk == 0 && amdgpu_current_state->evclk > 0) {
 		kv_enable_vce_dpm(adev, false);
-		/* turn the clocks off when not encoding */
-		ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-						    AMD_CG_STATE_GATE);
-		if (ret)
-			return ret;
 		kv_dpm_powergate_vce(adev, true);
 	}

@ -1686,70 +1677,44 @@ static void kv_dpm_powergate_uvd(struct amdgpu_device *adev, bool gate)
 	struct kv_power_info *pi = kv_get_pi(adev);
 	int ret;

-	if (pi->uvd_power_gated == gate)
-		return;
-
 	pi->uvd_power_gated = gate;

 	if (gate) {
-		if (pi->caps_uvd_pg) {
-			/* disable clockgating so we can properly shut down the block */
-			ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-							    AMD_CG_STATE_UNGATE);
-			/* shutdown the UVD block */
-			ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-							    AMD_PG_STATE_GATE);
-			/* XXX: check for errors */
-		}
+		/* stop the UVD block */
+		ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							AMD_PG_STATE_GATE);
 		kv_update_uvd_dpm(adev, gate);
 		if (pi->caps_uvd_pg)
 			/* power off the UVD block */
 			amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_UVDPowerOFF);
 	} else {
-		if (pi->caps_uvd_pg) {
+		if (pi->caps_uvd_pg)
 			/* power on the UVD block */
 			amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_UVDPowerON);
 			/* re-init the UVD block */
-			ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-							    AMD_PG_STATE_UNGATE);
-			/* enable clockgating. hw will dynamically gate/ungate clocks on the fly */
-			ret = amdgpu_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-							    AMD_CG_STATE_GATE);
-			/* XXX: check for errors */
-		}
 		kv_update_uvd_dpm(adev, gate);
+
+		ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+							AMD_PG_STATE_UNGATE);
 	}
 }

 static void kv_dpm_powergate_vce(struct amdgpu_device *adev, bool gate)
 {
 	struct kv_power_info *pi = kv_get_pi(adev);
-	int ret;

 	if (pi->vce_power_gated == gate)
 		return;

 	pi->vce_power_gated = gate;

-	if (gate) {
-		if (pi->caps_vce_pg) {
-			/* shutdown the VCE block */
-			ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-							    AMD_PG_STATE_GATE);
-			/* XXX: check for errors */
-			/* power off the VCE block */
-			amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerOFF);
-		}
-	} else {
-		if (pi->caps_vce_pg) {
-			/* power on the VCE block */
-			amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
-			/* re-init the VCE block */
-			ret = amdgpu_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
-							    AMD_PG_STATE_UNGATE);
-			/* XXX: check for errors */
-		}
-	}
+	if (!pi->caps_vce_pg)
+		return;
+
+	if (gate)
+		amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerOFF);
+	else
+		amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
 }

 static void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate)
@ -1904,19 +1869,19 @@ static int kv_enable_nb_dpm(struct amdgpu_device *adev,
 }

 static int kv_dpm_force_performance_level(struct amdgpu_device *adev,
-					  enum amdgpu_dpm_forced_level level)
+					  enum amd_dpm_forced_level level)
 {
 	int ret;

-	if (level == AMDGPU_DPM_FORCED_LEVEL_HIGH) {
+	if (level == AMD_DPM_FORCED_LEVEL_HIGH) {
 		ret = kv_force_dpm_highest(adev);
 		if (ret)
 			return ret;
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_LOW) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_LOW) {
 		ret = kv_force_dpm_lowest(adev);
 		if (ret)
 			return ret;
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_AUTO) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_AUTO) {
 		ret = kv_unforce_levels(adev);
 		if (ret)
 			return ret;
@ -3007,8 +2972,6 @@ static int kv_dpm_late_init(void *handle)

 	kv_dpm_powergate_acp(adev, true);
 	kv_dpm_powergate_samu(adev, true);
-	kv_dpm_powergate_vce(adev, true);
-	kv_dpm_powergate_uvd(adev, true);

 	return 0;
 }
@ -3029,7 +2992,7 @@ static int kv_dpm_sw_init(void *handle)
 	/* default to balanced state */
 	adev->pm.dpm.state = POWER_STATE_TYPE_BALANCED;
 	adev->pm.dpm.user_state = POWER_STATE_TYPE_BALANCED;
-	adev->pm.dpm.forced_level = AMDGPU_DPM_FORCED_LEVEL_AUTO;
+	adev->pm.dpm.forced_level = AMD_DPM_FORCED_LEVEL_AUTO;
 	adev->pm.default_sclk = adev->clock.default_sclk;
 	adev->pm.default_mclk = adev->clock.default_mclk;
 	adev->pm.current_sclk = adev->clock.default_sclk;
@ -3078,6 +3041,9 @@ static int kv_dpm_hw_init(void *handle)
 	int ret;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+	if (!amdgpu_dpm)
+		return 0;
+
 	mutex_lock(&adev->pm.mutex);
 	kv_dpm_setup_asic(adev);
 	ret = kv_dpm_enable(adev);
@ -3245,15 +3211,52 @@ static int kv_dpm_set_powergating_state(void *handle,
 	return 0;
 }

+static inline bool kv_are_power_levels_equal(const struct kv_pl *kv_cpl1,
+						const struct kv_pl *kv_cpl2)
+{
+	return ((kv_cpl1->sclk == kv_cpl2->sclk) &&
+		  (kv_cpl1->vddc_index == kv_cpl2->vddc_index) &&
+		  (kv_cpl1->ds_divider_index == kv_cpl2->ds_divider_index) &&
+		  (kv_cpl1->force_nbp_state == kv_cpl2->force_nbp_state));
+}
+
 static int kv_check_state_equal(struct amdgpu_device *adev,
 				struct amdgpu_ps *cps,
 				struct amdgpu_ps *rps,
 				bool *equal)
 {
-	if (equal == NULL)
+	struct kv_ps *kv_cps;
+	struct kv_ps *kv_rps;
+	int i;
+
+	if (adev == NULL || cps == NULL || rps == NULL || equal == NULL)
 		return -EINVAL;

-	*equal = false;
+	kv_cps = kv_get_ps(cps);
+	kv_rps = kv_get_ps(rps);
+
+	if (kv_cps == NULL) {
+		*equal = false;
+		return 0;
+	}
+
+	if (kv_cps->num_levels != kv_rps->num_levels) {
+		*equal = false;
+		return 0;
+	}
+
+	for (i = 0; i < kv_cps->num_levels; i++) {
+		if (!kv_are_power_levels_equal(&(kv_cps->levels[i]),
+					&(kv_rps->levels[i]))) {
+			*equal = false;
+			return 0;
+		}
+	}
+
+	/* If all performance levels are the same try to use the UVD clocks to break the tie.*/
+	*equal = ((cps->vclk == rps->vclk) && (cps->dclk == rps->dclk));
+	*equal &= ((cps->evclk == rps->evclk) && (cps->ecclk == rps->ecclk));
+
 	return 0;
 }

@ -3307,12 +3310,3 @@ static void kv_dpm_set_irq_funcs(struct amdgpu_device *adev)
 	adev->pm.dpm.thermal.irq.num_types = AMDGPU_THERMAL_IRQ_LAST;
 	adev->pm.dpm.thermal.irq.funcs = &kv_dpm_irq_funcs;
 }
-
-const struct amdgpu_ip_block_version kv_dpm_ip_block =
-{
-	.type = AMD_IP_BLOCK_TYPE_SMC,
-	.major = 7,
-	.minor = 0,
-	.rev = 0,
-	.funcs = &kv_dpm_ip_funcs,
-};
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@ -0,0 +1,592 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Xiangliang.Yu@amd.com
+ */
+
+#include "amdgpu.h"
+#include "vi.h"
+#include "bif/bif_5_0_d.h"
+#include "bif/bif_5_0_sh_mask.h"
+#include "vid.h"
+#include "gca/gfx_8_0_d.h"
+#include "gca/gfx_8_0_sh_mask.h"
+#include "gmc_v8_0.h"
+#include "gfx_v8_0.h"
+#include "sdma_v3_0.h"
+#include "tonga_ih.h"
+#include "gmc/gmc_8_2_d.h"
+#include "gmc/gmc_8_2_sh_mask.h"
+#include "oss/oss_3_0_d.h"
+#include "oss/oss_3_0_sh_mask.h"
+#include "gca/gfx_8_0_sh_mask.h"
+#include "dce/dce_10_0_d.h"
+#include "dce/dce_10_0_sh_mask.h"
+#include "smu/smu_7_1_3_d.h"
+#include "mxgpu_vi.h"
+
+/* VI golden setting */
+static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
+	mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
+	mmGRBM_GFX_INDEX, 0xffffffff, 0xe0000000,
+	mmCB_CGTT_SCLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_BCI_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_CP_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_CPC_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_CPF_CLK_CTRL, 0xffffffff, 0x40000100,
+	mmCGTT_DRM_CLK_CTRL0, 0xffffffff, 0x00600100,
+	mmCGTT_GDS_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_IA_CLK_CTRL, 0xffffffff, 0x06000100,
+	mmCGTT_PA_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_WD_CLK_CTRL, 0xffffffff, 0x06000100,
+	mmCGTT_PC_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_RLC_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_SC_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_SPI_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_SQ_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_SQG_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL0, 0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL1, 0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL2, 0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL3, 0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL4, 0xffffffff, 0x00000100,
+	mmCGTT_TCI_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_TCP_CLK_CTRL, 0xffffffff, 0x00000100,
+	mmCGTT_VGT_CLK_CTRL, 0xffffffff, 0x06000100,
+	mmDB_CGTT_CLK_CTRL_0, 0xffffffff, 0x00000100,
+	mmTA_CGTT_CTRL, 0xffffffff, 0x00000100,
+	mmTCA_CGTT_SCLK_CTRL, 0xffffffff, 0x00000100,
+	mmTCC_CGTT_SCLK_CTRL, 0xffffffff, 0x00000100,
+	mmTD_CGTT_CTRL, 0xffffffff, 0x00000100,
+	mmGRBM_GFX_INDEX, 0xffffffff, 0xe0000000,
+	mmCGTS_SM_CTRL_REG, 0xffffffff, 0x96e00200,
+	mmCP_RB_WPTR_POLL_CNTL, 0xffffffff, 0x00900100,
+	mmRLC_CGCG_CGLS_CTRL, 0xffffffff, 0x0020003c,
+	mmPCIE_INDEX, 0xffffffff, 0x0140001c,
+	mmPCIE_DATA, 0x000f0000, 0x00000000,
+	mmSMC_IND_INDEX_4, 0xffffffff, 0xC060000C,
+	mmSMC_IND_DATA_4, 0xc0000fff, 0x00000100,
+	mmXDMA_CLOCK_GATING_CNTL, 0xffffffff, 0x00000100,
+	mmXDMA_MEM_POWER_CNTL, 0x00000101, 0x00000000,
+	mmMC_MEM_POWER_LS, 0xffffffff, 0x00000104,
+	mmCGTT_DRM_CLK_CTRL0, 0xff000fff, 0x00000100,
+	mmHDP_XDP_CGTT_BLK_CTRL, 0xc0000fff, 0x00000104,
+	mmCP_MEM_SLP_CNTL, 0x00000001, 0x00000001,
+	mmSDMA0_CLK_CTRL, 0xff000ff0, 0x00000100,
+	mmSDMA1_CLK_CTRL, 0xff000ff0, 0x00000100,
+};
+
+static const u32 xgpu_fiji_golden_settings_a10[] = {
+	mmCB_HW_CONTROL_3, 0x000001ff, 0x00000040,
+	mmDB_DEBUG2, 0xf00fffff, 0x00000400,
+	mmDCI_CLK_CNTL, 0x00000080, 0x00000000,
+	mmFBC_DEBUG_COMP, 0x000000f0, 0x00000070,
+	mmFBC_MISC, 0x1f311fff, 0x12300000,
+	mmHDMI_CONTROL, 0x31000111, 0x00000011,
+	mmPA_SC_ENHANCE, 0xffffffff, 0x20000001,
+	mmPA_SC_LINE_STIPPLE_STATE, 0x0000ff0f, 0x00000000,
+	mmSDMA0_CHICKEN_BITS, 0xfc910007, 0x00810007,
+	mmSDMA0_GFX_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA0_RLC0_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA0_RLC1_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_CHICKEN_BITS, 0xfc910007, 0x00810007,
+	mmSDMA1_GFX_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_RLC0_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_RLC1_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSQ_RANDOM_WAVE_PRI, 0x001fffff, 0x000006fd,
+	mmTA_CNTL_AUX, 0x000f000f, 0x000b0000,
+	mmTCC_EXE_DISABLE, 0x00000002, 0x00000002,
+	mmTCP_ADDR_CONFIG, 0x000003ff, 0x000000ff,
+	mmVGT_RESET_DEBUG, 0x00000004, 0x00000004,
+	mmVM_PRT_APERTURE0_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE1_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE2_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE3_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+};
+
+static const u32 xgpu_fiji_golden_common_all[] = {
+	mmGRBM_GFX_INDEX, 0xffffffff, 0xe0000000,
+	mmPA_SC_RASTER_CONFIG, 0xffffffff, 0x3a00161a,
+	mmPA_SC_RASTER_CONFIG_1, 0xffffffff, 0x0000002e,
+	mmGB_ADDR_CONFIG, 0xffffffff, 0x22011003,
+	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
+	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF,
+	mmGRBM_GFX_INDEX, 0xffffffff, 0xe0000000,
+	mmSPI_CONFIG_CNTL_1, 0x0000000f, 0x00000009,
+};
+
+static const u32 xgpu_tonga_mgcg_cgcg_init[] = {
+	mmRLC_CGTT_MGCG_OVERRIDE,   0xffffffff, 0xffffffff,
+	mmGRBM_GFX_INDEX,           0xffffffff, 0xe0000000,
+	mmCB_CGTT_SCLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_BCI_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_CP_CLK_CTRL,         0xffffffff, 0x00000100,
+	mmCGTT_CPC_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_CPF_CLK_CTRL,        0xffffffff, 0x40000100,
+	mmCGTT_DRM_CLK_CTRL0,       0xffffffff, 0x00600100,
+	mmCGTT_GDS_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_IA_CLK_CTRL,         0xffffffff, 0x06000100,
+	mmCGTT_PA_CLK_CTRL,         0xffffffff, 0x00000100,
+	mmCGTT_WD_CLK_CTRL,         0xffffffff, 0x06000100,
+	mmCGTT_PC_CLK_CTRL,         0xffffffff, 0x00000100,
+	mmCGTT_RLC_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_SC_CLK_CTRL,         0xffffffff, 0x00000100,
+	mmCGTT_SPI_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_SQ_CLK_CTRL,         0xffffffff, 0x00000100,
+	mmCGTT_SQG_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL0,        0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL1,        0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL2,        0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL3,        0xffffffff, 0x00000100,
+	mmCGTT_SX_CLK_CTRL4,        0xffffffff, 0x00000100,
+	mmCGTT_TCI_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_TCP_CLK_CTRL,        0xffffffff, 0x00000100,
+	mmCGTT_VGT_CLK_CTRL,        0xffffffff, 0x06000100,
+	mmDB_CGTT_CLK_CTRL_0,       0xffffffff, 0x00000100,
+	mmTA_CGTT_CTRL,             0xffffffff, 0x00000100,
+	mmTCA_CGTT_SCLK_CTRL,       0xffffffff, 0x00000100,
+	mmTCC_CGTT_SCLK_CTRL,       0xffffffff, 0x00000100,
+	mmTD_CGTT_CTRL,             0xffffffff, 0x00000100,
+	mmGRBM_GFX_INDEX,           0xffffffff, 0xe0000000,
+	mmCGTS_CU0_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU0_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU0_TA_SQC_CTRL_REG, 0xffffffff, 0x00040007,
+	mmCGTS_CU0_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU0_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU1_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU1_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU1_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU1_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU1_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU2_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU2_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU2_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU2_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU2_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU3_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU3_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU3_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU3_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU3_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU4_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU4_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU4_TA_SQC_CTRL_REG, 0xffffffff, 0x00040007,
+	mmCGTS_CU4_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU4_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU5_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU5_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU5_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU5_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU5_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU6_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU6_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU6_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU6_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU6_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_CU7_SP0_CTRL_REG,    0xffffffff, 0x00010000,
+	mmCGTS_CU7_LDS_SQ_CTRL_REG, 0xffffffff, 0x00030002,
+	mmCGTS_CU7_TA_CTRL_REG,     0xffffffff, 0x00040007,
+	mmCGTS_CU7_SP1_CTRL_REG,    0xffffffff, 0x00060005,
+	mmCGTS_CU7_TD_TCP_CTRL_REG, 0xffffffff, 0x00090008,
+	mmCGTS_SM_CTRL_REG,         0xffffffff, 0x96e00200,
+	mmCP_RB_WPTR_POLL_CNTL,     0xffffffff, 0x00900100,
+	mmRLC_CGCG_CGLS_CTRL,       0xffffffff, 0x0020003c,
+	mmPCIE_INDEX,               0xffffffff, 0x0140001c,
+	mmPCIE_DATA,                0x000f0000, 0x00000000,
+	mmSMC_IND_INDEX_4,          0xffffffff, 0xC060000C,
+	mmSMC_IND_DATA_4,           0xc0000fff, 0x00000100,
+	mmXDMA_CLOCK_GATING_CNTL,   0xffffffff, 0x00000100,
+	mmXDMA_MEM_POWER_CNTL,      0x00000101, 0x00000000,
+	mmMC_MEM_POWER_LS,          0xffffffff, 0x00000104,
+	mmCGTT_DRM_CLK_CTRL0,       0xff000fff, 0x00000100,
+	mmHDP_XDP_CGTT_BLK_CTRL,    0xc0000fff, 0x00000104,
+	mmCP_MEM_SLP_CNTL,          0x00000001, 0x00000001,
+	mmSDMA0_CLK_CTRL,           0xff000ff0, 0x00000100,
+	mmSDMA1_CLK_CTRL,           0xff000ff0, 0x00000100,
+};
+
+static const u32 xgpu_tonga_golden_settings_a11[] = {
+	mmCB_HW_CONTROL, 0xfffdf3cf, 0x00007208,
+	mmCB_HW_CONTROL_3, 0x00000040, 0x00000040,
+	mmDB_DEBUG2, 0xf00fffff, 0x00000400,
+	mmDCI_CLK_CNTL, 0x00000080, 0x00000000,
+	mmFBC_DEBUG_COMP, 0x000000f0, 0x00000070,
+	mmFBC_MISC, 0x1f311fff, 0x12300000,
+	mmGB_GPU_ID, 0x0000000f, 0x00000000,
+	mmHDMI_CONTROL, 0x31000111, 0x00000011,
+	mmMC_ARB_WTM_GRPWT_RD, 0x00000003, 0x00000000,
+	mmMC_HUB_RDREQ_DMIF_LIMIT, 0x0000007f, 0x00000028,
+	mmMC_HUB_WDP_UMC, 0x00007fb6, 0x00000991,
+	mmPA_SC_ENHANCE, 0xffffffff, 0x20000001,
+	mmPA_SC_FIFO_DEPTH_CNTL, 0x000003ff, 0x000000fc,
+	mmPA_SC_LINE_STIPPLE_STATE, 0x0000ff0f, 0x00000000,
+	mmRLC_CGCG_CGLS_CTRL, 0x00000003, 0x0000003c,
+	mmSDMA0_CHICKEN_BITS, 0xfc910007, 0x00810007,
+	mmSDMA0_CLK_CTRL, 0xff000fff, 0x00000000,
+	mmSDMA0_GFX_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA0_RLC0_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA0_RLC1_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_CHICKEN_BITS, 0xfc910007, 0x00810007,
+	mmSDMA1_CLK_CTRL, 0xff000fff, 0x00000000,
+	mmSDMA1_GFX_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_RLC0_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSDMA1_RLC1_IB_CNTL, 0x800f0111, 0x00000100,
+	mmSQ_RANDOM_WAVE_PRI, 0x001fffff, 0x000006fd,
+	mmTA_CNTL_AUX, 0x000f000f, 0x000b0000,
+	mmTCC_CTRL, 0x00100000, 0xf31fff7f,
+	mmTCC_EXE_DISABLE, 0x00000002, 0x00000002,
+	mmTCP_ADDR_CONFIG, 0x000003ff, 0x000002fb,
+	mmTCP_CHAN_STEER_HI, 0xffffffff, 0x0000543b,
+	mmTCP_CHAN_STEER_LO, 0xffffffff, 0xa9210876,
+	mmVGT_RESET_DEBUG, 0x00000004, 0x00000004,
+	mmVM_PRT_APERTURE0_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE1_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE2_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+	mmVM_PRT_APERTURE3_LOW_ADDR, 0x0fffffff, 0x0fffffff,
+};
+
+static const u32 xgpu_tonga_golden_common_all[] = {
+	mmGRBM_GFX_INDEX,               0xffffffff, 0xe0000000,
+	mmPA_SC_RASTER_CONFIG,          0xffffffff, 0x16000012,
+	mmPA_SC_RASTER_CONFIG_1,        0xffffffff, 0x0000002A,
+	mmGB_ADDR_CONFIG,               0xffffffff, 0x22011002,
+	mmSPI_RESOURCE_RESERVE_CU_0,    0xffffffff, 0x00000800,
+	mmSPI_RESOURCE_RESERVE_CU_1,    0xffffffff, 0x00000800,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
+};
+
+void xgpu_vi_init_golden_registers(struct amdgpu_device *adev)
+{
+	switch (adev->asic_type) {
+	case CHIP_FIJI:
+		amdgpu_program_register_sequence(adev,
+						 xgpu_fiji_mgcg_cgcg_init,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_fiji_mgcg_cgcg_init));
+		amdgpu_program_register_sequence(adev,
+						 xgpu_fiji_golden_settings_a10,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_fiji_golden_settings_a10));
+		amdgpu_program_register_sequence(adev,
+						 xgpu_fiji_golden_common_all,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_fiji_golden_common_all));
+		break;
+	case CHIP_TONGA:
+		amdgpu_program_register_sequence(adev,
+						 xgpu_tonga_mgcg_cgcg_init,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_tonga_mgcg_cgcg_init));
+		amdgpu_program_register_sequence(adev,
+						 xgpu_tonga_golden_settings_a11,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_tonga_golden_settings_a11));
+		amdgpu_program_register_sequence(adev,
+						 xgpu_tonga_golden_common_all,
+						 (const u32)ARRAY_SIZE(
+						 xgpu_tonga_golden_common_all));
+		break;
+	default:
+		BUG_ON("Doesn't support chip type.\n");
+		break;
+	}
+}
+
+/*
+ * Mailbox communication between GPU hypervisor and VFs
+ */
+static void xgpu_vi_mailbox_send_ack(struct amdgpu_device *adev)
+{
+	u32 reg;
+
+	reg = RREG32(mmMAILBOX_CONTROL);
+	reg = REG_SET_FIELD(reg, MAILBOX_CONTROL, RCV_MSG_ACK, 1);
+	WREG32(mmMAILBOX_CONTROL, reg);
+}
+
+static void xgpu_vi_mailbox_set_valid(struct amdgpu_device *adev, bool val)
+{
+	u32 reg;
+
+	reg = RREG32(mmMAILBOX_CONTROL);
+	reg = REG_SET_FIELD(reg, MAILBOX_CONTROL,
+			    TRN_MSG_VALID, val ? 1 : 0);
+	WREG32(mmMAILBOX_CONTROL, reg);
+}
+
+static void xgpu_vi_mailbox_trans_msg(struct amdgpu_device *adev,
+				      enum idh_event event)
+{
+	u32 reg;
+
+	reg = RREG32(mmMAILBOX_MSGBUF_TRN_DW0);
+	reg = REG_SET_FIELD(reg, MAILBOX_MSGBUF_TRN_DW0,
+			    MSGBUF_DATA, event);
+	WREG32(mmMAILBOX_MSGBUF_TRN_DW0, reg);
+
+	xgpu_vi_mailbox_set_valid(adev, true);
+}
+
+static int xgpu_vi_mailbox_rcv_msg(struct amdgpu_device *adev,
+				   enum idh_event event)
+{
+	u32 reg;
+
+	reg = RREG32(mmMAILBOX_MSGBUF_RCV_DW0);
+	if (reg != event)
+		return -ENOENT;
+
+	/* send ack to PF */
+	xgpu_vi_mailbox_send_ack(adev);
+
+	return 0;
+}
+
+static int xgpu_vi_poll_ack(struct amdgpu_device *adev)
+{
+	int r = 0, timeout = VI_MAILBOX_TIMEDOUT;
+	u32 mask = REG_FIELD_MASK(MAILBOX_CONTROL, TRN_MSG_ACK);
+	u32 reg;
+
+	reg = RREG32(mmMAILBOX_CONTROL);
+	while (!(reg & mask)) {
+		if (timeout <= 0) {
+			pr_err("Doesn't get ack from pf.\n");
+			r = -ETIME;
+			break;
+		}
+		msleep(1);
+		timeout -= 1;
+
+		reg = RREG32(mmMAILBOX_CONTROL);
+	}
+
+	return r;
+}
+
+static int xgpu_vi_poll_msg(struct amdgpu_device *adev, enum idh_event event)
+{
+	int r = 0, timeout = VI_MAILBOX_TIMEDOUT;
+
+	r = xgpu_vi_mailbox_rcv_msg(adev, event);
+	while (r) {
+		if (timeout <= 0) {
+			pr_err("Doesn't get ack from pf.\n");
+			r = -ETIME;
+			break;
+		}
+		msleep(1);
+		timeout -= 1;
+
+		r = xgpu_vi_mailbox_rcv_msg(adev, event);
+	}
+
+	return r;
+}
+
+static int xgpu_vi_send_access_requests(struct amdgpu_device *adev,
+					enum idh_request request)
+{
+	int r;
+
+	xgpu_vi_mailbox_trans_msg(adev, request);
+
+	/* start to poll ack */
+	r = xgpu_vi_poll_ack(adev);
+	if (r)
+		return r;
+
+	xgpu_vi_mailbox_set_valid(adev, false);
+
+	/* start to check msg if request is idh_req_gpu_init_access */
+	if (request == IDH_REQ_GPU_INIT_ACCESS) {
+		r = xgpu_vi_poll_msg(adev, IDH_READY_TO_ACCESS_GPU);
+		if (r)
+			return r;
+	}
+
+	return 0;
+}
+
+static int xgpu_vi_request_reset(struct amdgpu_device *adev)
+{
+	return xgpu_vi_send_access_requests(adev, IDH_REQ_GPU_RESET_ACCESS);
+}
+
+static int xgpu_vi_request_full_gpu_access(struct amdgpu_device *adev,
+					   bool init)
+{
+	enum idh_event event;
+
+	event = init ? IDH_REQ_GPU_INIT_ACCESS : IDH_REQ_GPU_FINI_ACCESS;
+	return xgpu_vi_send_access_requests(adev, event);
+}
+
+static int xgpu_vi_release_full_gpu_access(struct amdgpu_device *adev,
+					   bool init)
+{
+	enum idh_event event;
+	int r = 0;
+
+	event = init ? IDH_REL_GPU_INIT_ACCESS : IDH_REL_GPU_FINI_ACCESS;
+	r = xgpu_vi_send_access_requests(adev, event);
+
+	return r;
+}
+
+/* add support mailbox interrupts */
+static int xgpu_vi_mailbox_ack_irq(struct amdgpu_device *adev,
+				   struct amdgpu_irq_src *source,
+				   struct amdgpu_iv_entry *entry)
+{
+	DRM_DEBUG("get ack intr and do nothing.\n");
+	return 0;
+}
+
+static int xgpu_vi_set_mailbox_ack_irq(struct amdgpu_device *adev,
+				       struct amdgpu_irq_src *src,
+				       unsigned type,
+				       enum amdgpu_interrupt_state state)
+{
+	u32 tmp = RREG32(mmMAILBOX_INT_CNTL);
+
+	tmp = REG_SET_FIELD(tmp, MAILBOX_INT_CNTL, ACK_INT_EN,
+			    (state == AMDGPU_IRQ_STATE_ENABLE) ? 1 : 0);
+	WREG32(mmMAILBOX_INT_CNTL, tmp);
+
+	return 0;
+}
+
+static void xgpu_vi_mailbox_flr_work(struct work_struct *work)
+{
+	struct amdgpu_virt *virt = container_of(work,
+					struct amdgpu_virt, flr_work.work);
+	struct amdgpu_device *adev = container_of(virt,
+					struct amdgpu_device, virt);
+	int r = 0;
+
+	r = xgpu_vi_poll_msg(adev, IDH_FLR_NOTIFICATION_CMPL);
+	if (r)
+		DRM_ERROR("failed to get flr cmpl msg from hypervior.\n");
+
+	/* TODO: need to restore gfx states */
+}
+
+static int xgpu_vi_set_mailbox_rcv_irq(struct amdgpu_device *adev,
+				       struct amdgpu_irq_src *src,
+				       unsigned type,
+				       enum amdgpu_interrupt_state state)
+{
+	u32 tmp = RREG32(mmMAILBOX_INT_CNTL);
+
+	tmp = REG_SET_FIELD(tmp, MAILBOX_INT_CNTL, VALID_INT_EN,
+			    (state == AMDGPU_IRQ_STATE_ENABLE) ? 1 : 0);
+	WREG32(mmMAILBOX_INT_CNTL, tmp);
+
+	return 0;
+}
+
+static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
+				   struct amdgpu_irq_src *source,
+				   struct amdgpu_iv_entry *entry)
+{
+	int r;
+
+	adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME;
+	r = xgpu_vi_mailbox_rcv_msg(adev, IDH_FLR_NOTIFICATION);
+	/* do nothing for other msg */
+	if (r)
+		return 0;
+
+	/* TODO: need to save gfx states */
+	schedule_delayed_work(&adev->virt.flr_work,
+			      msecs_to_jiffies(VI_MAILBOX_RESET_TIME));
+
+	return 0;
+}
+
+static const struct amdgpu_irq_src_funcs xgpu_vi_mailbox_ack_irq_funcs = {
+	.set = xgpu_vi_set_mailbox_ack_irq,
+	.process = xgpu_vi_mailbox_ack_irq,
+};
+
+static const struct amdgpu_irq_src_funcs xgpu_vi_mailbox_rcv_irq_funcs = {
+	.set = xgpu_vi_set_mailbox_rcv_irq,
+	.process = xgpu_vi_mailbox_rcv_irq,
+};
+
+void xgpu_vi_mailbox_set_irq_funcs(struct amdgpu_device *adev)
+{
+	adev->virt.ack_irq.num_types = 1;
+	adev->virt.ack_irq.funcs = &xgpu_vi_mailbox_ack_irq_funcs;
+	adev->virt.rcv_irq.num_types = 1;
+	adev->virt.rcv_irq.funcs = &xgpu_vi_mailbox_rcv_irq_funcs;
+}
+
+int xgpu_vi_mailbox_add_irq_id(struct amdgpu_device *adev)
+{
+	int r;
+
+	r = amdgpu_irq_add_id(adev, 135, &adev->virt.rcv_irq);
+	if (r)
+		return r;
+
+	r = amdgpu_irq_add_id(adev, 138, &adev->virt.ack_irq);
+	if (r) {
+		amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0);
+		return r;
+	}
+
+	return 0;
+}
+
+int xgpu_vi_mailbox_get_irq(struct amdgpu_device *adev)
+{
+	int r;
+
+	r = amdgpu_irq_get(adev, &adev->virt.rcv_irq, 0);
+	if (r)
+		return r;
+	r = amdgpu_irq_get(adev, &adev->virt.ack_irq, 0);
+	if (r) {
+		amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0);
+		return r;
+	}
+
+	INIT_DELAYED_WORK(&adev->virt.flr_work, xgpu_vi_mailbox_flr_work);
+
+	return 0;
+}
+
+void xgpu_vi_mailbox_put_irq(struct amdgpu_device *adev)
+{
+	cancel_delayed_work_sync(&adev->virt.flr_work);
+	amdgpu_irq_put(adev, &adev->virt.ack_irq, 0);
+	amdgpu_irq_put(adev, &adev->virt.rcv_irq, 0);
+}
+
+const struct amdgpu_virt_ops xgpu_vi_virt_ops = {
+	.req_full_gpu		= xgpu_vi_request_full_gpu_access,
+	.rel_full_gpu		= xgpu_vi_release_full_gpu_access,
+	.reset_gpu		= xgpu_vi_request_reset,
+};
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
@ -0,0 +1,55 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __MXGPU_VI_H__
+#define __MXGPU_VI_H__
+
+#define VI_MAILBOX_TIMEDOUT	150
+#define VI_MAILBOX_RESET_TIME	12
+
+/* VI mailbox messages request */
+enum idh_request {
+	IDH_REQ_GPU_INIT_ACCESS	= 1,
+	IDH_REL_GPU_INIT_ACCESS,
+	IDH_REQ_GPU_FINI_ACCESS,
+	IDH_REL_GPU_FINI_ACCESS,
+	IDH_REQ_GPU_RESET_ACCESS
+};
+
+/* VI mailbox messages data */
+enum idh_event {
+	IDH_CLR_MSG_BUF = 0,
+	IDH_READY_TO_ACCESS_GPU,
+	IDH_FLR_NOTIFICATION,
+	IDH_FLR_NOTIFICATION_CMPL,
+	IDH_EVENT_MAX
+};
+
+extern const struct amdgpu_virt_ops xgpu_vi_virt_ops;
+
+void xgpu_vi_init_golden_registers(struct amdgpu_device *adev);
+void xgpu_vi_mailbox_set_irq_funcs(struct amdgpu_device *adev);
+int xgpu_vi_mailbox_add_irq_id(struct amdgpu_device *adev);
+int xgpu_vi_mailbox_get_irq(struct amdgpu_device *adev);
+void xgpu_vi_mailbox_put_irq(struct amdgpu_device *adev);
+
+#endif
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@ -701,7 +701,7 @@ static int sdma_v2_4_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[7] = SDMA_PKT_HEADER_OP(SDMA_OP_NOP);
 	ib.length_dw = 8;

-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err1;

--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@ -910,7 +910,7 @@ static int sdma_v3_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[7] = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP);
 	ib.length_dw = 8;

-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err1;

@ -1533,6 +1533,22 @@ static int sdma_v3_0_set_powergating_state(void *handle,
 	return 0;
 }

+static void sdma_v3_0_get_clockgating_state(void *handle, u32 *flags)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	int data;
+
+	/* AMD_CG_SUPPORT_SDMA_MGCG */
+	data = RREG32(mmSDMA0_CLK_CTRL + sdma_offsets[0]);
+	if (!(data & SDMA0_CLK_CTRL__SOFT_OVERRIDE0_MASK))
+		*flags |= AMD_CG_SUPPORT_SDMA_MGCG;
+
+	/* AMD_CG_SUPPORT_SDMA_LS */
+	data = RREG32(mmSDMA0_POWER_CNTL + sdma_offsets[0]);
+	if (data & SDMA0_POWER_CNTL__MEM_POWER_OVERRIDE_MASK)
+		*flags |= AMD_CG_SUPPORT_SDMA_LS;
+}
+
 static const struct amd_ip_funcs sdma_v3_0_ip_funcs = {
 	.name = "sdma_v3_0",
 	.early_init = sdma_v3_0_early_init,
@ -1551,6 +1567,7 @@ static const struct amd_ip_funcs sdma_v3_0_ip_funcs = {
 	.soft_reset = sdma_v3_0_soft_reset,
 	.set_clockgating_state = sdma_v3_0_set_clockgating_state,
 	.set_powergating_state = sdma_v3_0_set_powergating_state,
+	.get_clockgating_state = sdma_v3_0_get_clockgating_state,
 };

 static const struct amdgpu_ring_funcs sdma_v3_0_ring_funcs = {
--- a/drivers/gpu/drm/amd/amdgpu/si.c
+++ b/drivers/gpu/drm/amd/amdgpu/si.c
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@ -24,7 +24,7 @@
 #include <drm/drmP.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
-#include "si/sid.h"
+#include "sid.h"

 const u32 sdma_offsets[SDMA_MAX_INSTANCE] =
 {
@ -301,7 +301,7 @@ static int si_dma_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 	ib.ptr[2] = upper_32_bits(gpu_addr) & 0xff;
 	ib.ptr[3] = 0xDEADBEEF;
 	ib.length_dw = 4;
-	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, NULL, &f);
+	r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
 	if (r)
 		goto err1;

--- a/drivers/gpu/drm/amd/amdgpu/si_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
@ -26,7 +26,7 @@
 #include "amdgpu_pm.h"
 #include "amdgpu_dpm.h"
 #include "amdgpu_atombios.h"
-#include "si/sid.h"
+#include "sid.h"
 #include "r600_dpm.h"
 #include "si_dpm.h"
 #include "atom.h"
@ -3009,29 +3009,6 @@ static int si_init_smc_spll_table(struct amdgpu_device *adev)
 	return ret;
 }

-struct si_dpm_quirk {
-	u32 chip_vendor;
-	u32 chip_device;
-	u32 subsys_vendor;
-	u32 subsys_device;
-	u32 max_sclk;
-	u32 max_mclk;
-};
-
-/* cards with dpm stability problems */
-static struct si_dpm_quirk si_dpm_quirk_list[] = {
-	/* PITCAIRN - https://bugs.freedesktop.org/show_bug.cgi?id=76490 */
-	{ PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0x2015, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 },
-	{ PCI_VENDOR_ID_ATI, 0x6811, 0x1462, 0x2015, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6811, 0x1043, 0x2015, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6811, 0x148c, 0x2015, 0, 120000 },
-	{ PCI_VENDOR_ID_ATI, 0x6810, 0x1682, 0x9275, 0, 120000 },
-	{ 0, 0, 0, 0 },
-};
-
 static u16 si_get_lower_of_leakage_and_vce_voltage(struct amdgpu_device *adev,
 						   u16 vce_voltage)
 {
@ -3477,18 +3454,8 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
 	u32 max_sclk_vddc, max_mclk_vddci, max_mclk_vddc;
 	u32 max_sclk = 0, max_mclk = 0;
 	int i;
-	struct si_dpm_quirk *p = si_dpm_quirk_list;

-	/* limit all SI kickers */
-	if (adev->asic_type == CHIP_PITCAIRN) {
-		if ((adev->pdev->revision == 0x81) ||
-		    (adev->pdev->device == 0x6810) ||
-		    (adev->pdev->device == 0x6811) ||
-		    (adev->pdev->device == 0x6816) ||
-		    (adev->pdev->device == 0x6817) ||
-		    (adev->pdev->device == 0x6806))
-			max_mclk = 120000;
-	} else if (adev->asic_type == CHIP_HAINAN) {
+	if (adev->asic_type == CHIP_HAINAN) {
 		if ((adev->pdev->revision == 0x81) ||
 		    (adev->pdev->revision == 0x83) ||
 		    (adev->pdev->revision == 0xC3) ||
@ -3498,18 +3465,6 @@ static void si_apply_state_adjust_rules(struct amdgpu_device *adev,
 			max_sclk = 75000;
 		}
 	}
-	/* Apply dpm quirks */
-	while (p && p->chip_device != 0) {
-		if (adev->pdev->vendor == p->chip_vendor &&
-		    adev->pdev->device == p->chip_device &&
-		    adev->pdev->subsystem_vendor == p->subsys_vendor &&
-		    adev->pdev->subsystem_device == p->subsys_device) {
-			max_sclk = p->max_sclk;
-			max_mclk = p->max_mclk;
-			break;
-		}
-		++p;
-	}

 	if (rps->vce_active) {
 		rps->evclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].evclk;
@ -3906,25 +3861,25 @@ static int si_restrict_performance_levels_before_switch(struct amdgpu_device *ad
 }

 static int si_dpm_force_performance_level(struct amdgpu_device *adev,
-				   enum amdgpu_dpm_forced_level level)
+				   enum amd_dpm_forced_level level)
 {
 	struct amdgpu_ps *rps = adev->pm.dpm.current_ps;
 	struct  si_ps *ps = si_get_ps(rps);
 	u32 levels = ps->performance_level_count;

-	if (level == AMDGPU_DPM_FORCED_LEVEL_HIGH) {
+	if (level == AMD_DPM_FORCED_LEVEL_HIGH) {
 		if (si_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SetEnabledLevels, levels) != PPSMC_Result_OK)
 			return -EINVAL;

 		if (si_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SetForcedLevels, 1) != PPSMC_Result_OK)
 			return -EINVAL;
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_LOW) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_LOW) {
 		if (si_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SetForcedLevels, 0) != PPSMC_Result_OK)
 			return -EINVAL;

 		if (si_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SetEnabledLevels, 1) != PPSMC_Result_OK)
 			return -EINVAL;
-	} else if (level == AMDGPU_DPM_FORCED_LEVEL_AUTO) {
+	} else if (level == AMD_DPM_FORCED_LEVEL_AUTO) {
 		if (si_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SetForcedLevels, 0) != PPSMC_Result_OK)
 			return -EINVAL;

@ -7746,7 +7701,7 @@ static int si_dpm_sw_init(void *handle)
 	/* default to balanced state */
 	adev->pm.dpm.state = POWER_STATE_TYPE_BALANCED;
 	adev->pm.dpm.user_state = POWER_STATE_TYPE_BALANCED;
-	adev->pm.dpm.forced_level = AMDGPU_DPM_FORCED_LEVEL_AUTO;
+	adev->pm.dpm.forced_level = AMD_DPM_FORCED_LEVEL_AUTO;
 	adev->pm.default_sclk = adev->clock.default_sclk;
 	adev->pm.default_mclk = adev->clock.default_mclk;
 	adev->pm.current_sclk = adev->clock.default_sclk;
@ -8072,11 +8027,3 @@ static void si_dpm_set_irq_funcs(struct amdgpu_device *adev)
 	adev->pm.dpm.thermal.irq.funcs = &si_dpm_irq_funcs;
 }

-const struct amdgpu_ip_block_version si_dpm_ip_block =
-{
-	.type = AMD_IP_BLOCK_TYPE_SMC,
-	.major = 6,
-	.minor = 0,
-	.rev = 0,
-	.funcs = &si_dpm_ip_funcs,
-};
--- a/drivers/gpu/drm/amd/amdgpu/si_enums.h
+++ b/drivers/gpu/drm/amd/amdgpu/si_enums.h
@ -143,8 +143,8 @@
 #define RLC_CLEAR_STATE_DESCRIPTOR_OFFSET    0x3D

 #define TAHITI_GB_ADDR_CONFIG_GOLDEN        0x12011003
-#define VERDE_GB_ADDR_CONFIG_GOLDEN         0x12010002
-#define HAINAN_GB_ADDR_CONFIG_GOLDEN        0x02010001
+#define VERDE_GB_ADDR_CONFIG_GOLDEN         0x02010002
+#define HAINAN_GB_ADDR_CONFIG_GOLDEN        0x02011003

 #define PACKET3(op, n)  ((RADEON_PACKET_TYPE3 << 30) |                  \
                         (((op) & 0xFF) << 8) |                         \
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@ -23,7 +23,7 @@
 #include "drmP.h"
 #include "amdgpu.h"
 #include "amdgpu_ih.h"
-#include "si/sid.h"
+#include "sid.h"
 #include "si_ih.h"

 static void si_ih_set_interrupt_funcs(struct amdgpu_device *adev);
--- a/Show More
+++ b/Show More