Commit graph

916892 commits

Author SHA1 Message Date
Suzuki K Poulose d375b356e6 coresight: Fix support for sparsely populated ports
On some systems the firmware may not describe all the ports
connected to a component (e.g, for security reasons). This
could be especially problematic for "funnels" where we could
end up in modifying memory beyond the allocated space for
refcounts.

e.g, for a funnel with input ports listed 0, 3, 5, nr_inport = 3.
However the we could access refcnts[5] while checking for
references, like :

 [  526.110401] ==================================================================
 [  526.117988] BUG: KASAN: slab-out-of-bounds in funnel_enable+0x54/0x1b0
 [  526.124706] Read of size 4 at addr ffffff8135f9549c by task bash/1114
 [  526.131324]
 [  526.132886] CPU: 3 PID: 1114 Comm: bash Tainted: G S                5.4.25 #232
 [  526.140397] Hardware name: Qualcomm Technologies, Inc. SC7180 IDP (DT)
 [  526.147113] Call trace:
 [  526.149653]  dump_backtrace+0x0/0x188
 [  526.153431]  show_stack+0x20/0x2c
 [  526.156852]  dump_stack+0xdc/0x144
 [  526.160370]  print_address_description+0x3c/0x494
 [  526.165211]  __kasan_report+0x144/0x168
 [  526.169170]  kasan_report+0x10/0x18
 [  526.172769]  check_memory_region+0x1a4/0x1b4
 [  526.177164]  __kasan_check_read+0x18/0x24
 [  526.181292]  funnel_enable+0x54/0x1b0
 [  526.185072]  coresight_enable_path+0x104/0x198
 [  526.189649]  coresight_enable+0x118/0x26c

  ...

 [  526.237782] Allocated by task 280:
 [  526.241298]  __kasan_kmalloc+0xf0/0x1ac
 [  526.245249]  kasan_kmalloc+0xc/0x14
 [  526.248849]  __kmalloc+0x28c/0x3b4
 [  526.252361]  coresight_register+0x88/0x250
 [  526.256587]  funnel_probe+0x15c/0x228
 [  526.260365]  dynamic_funnel_probe+0x20/0x2c
 [  526.264679]  amba_probe+0xbc/0x158
 [  526.268193]  really_probe+0x144/0x408
 [  526.271970]  driver_probe_device+0x70/0x140

 ...

 [  526.316810]
 [  526.318364] Freed by task 0:
 [  526.321344] (stack is not available)
 [  526.325024]
 [  526.326580] The buggy address belongs to the object at ffffff8135f95480
 [  526.326580]  which belongs to the cache kmalloc-128 of size 128
 [  526.339439] The buggy address is located 28 bytes inside of
 [  526.339439]  128-byte region [ffffff8135f95480, ffffff8135f95500)
 [  526.351399] The buggy address belongs to the page:
 [  526.356342] page:ffffffff04b7e500 refcount:1 mapcount:0 mapping:ffffff814b00c380 index:0x0 compound_mapcount: 0
 [  526.366711] flags: 0x4000000000010200(slab|head)
 [  526.371475] raw: 4000000000010200 ffffffff05034008 ffffffff0501eb08 ffffff814b00c380
 [  526.379435] raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
 [  526.387393] page dumped because: kasan: bad access detected
 [  526.393128]
 [  526.394681] Memory state around the buggy address:
 [  526.399619]  ffffff8135f95380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [  526.407046]  ffffff8135f95400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [  526.414473] >ffffff8135f95480: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [  526.421900]                             ^
 [  526.426029]  ffffff8135f95500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [  526.433456]  ffffff8135f95580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [  526.440883] ==================================================================

To keep the code simple, we now track the maximum number of
possible input/output connections to/from this component
@ nr_inport and nr_outport in platform_data, respectively.
Thus the output connections could be sparse and code is
adjusted to skip the unspecified connections.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Reported-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-13-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Jason Yan 1c33c65cfe coresight: etb10: Make coresight_etb_groups static
Fix the following sparse warning:

drivers/hwtracing/coresight/coresight-etb10.c:720:30: warning: symbol
'coresight_etb_groups' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-12-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Jason Yan ebd9b67850 coresight: cti: Make some symbols static
Fix the following sparse warning:

drivers/hwtracing/coresight/coresight-cti.c:22:1: warning: symbol
'ect_net' was not declared. Should it be static?
drivers/hwtracing/coresight/coresight-cti.c:625:32: warning: symbol
'cti_ops_ect' was not declared. Should it be static?
drivers/hwtracing/coresight/coresight-cti.c:630:28: warning: symbol
'cti_ops' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-11-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Sai Prakash Ranjan 41e8c7205c coresight: etm4x: Replace ETM PIDs with UCI IDs for Kryo385
Replace the AMBA ETM PIDs with UCI IDs to avoid future
conflicts when adding the CTI support for QCOM Kryo385
CPU cores.

Fixes: 17b4add0d4 ("coresight: etm4x: Add ETM PIDs for SDM845 and MSM8996")
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-10-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Sai Prakash Ranjan 63314ca20f coresight: etm4x: Add support for Qualcomm SC7180 SoC
Add ETM UCI IDs for Qualcomm SC7180 SoC. It has 2
big CPU cores based on Cortex-A76 and 6 LITTLE CPU
cores based on Cortex-A55.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-9-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Mauro Carvalho Chehab 7f06a1c989 docs: trace: coresight-ect.rst: Fix a build warning
Sphinx wants a line after "..", as otherwise it complains with:

	Documentation/trace/coresight/coresight-ect.rst:2: WARNING: Explicit markup ends without a blank line; unexpected unindent.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-8-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Mike Leach 5153e57bf8 coresight: docs: Add information about the topology representations
Update the CoreSight documents to describe the new connections directory
and the links between CoreSight devices in this directory.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-7-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:16 +02:00
Mike Leach 73274abb65 coresight: cti: Add in sysfs links to other coresight devices
Adds in sysfs links for connections where the connected device is another
coresight device. This allows examination of the coresight topology.

Non-coresight connections remain just as a reference name.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-6-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:15 +02:00
Suzuki K Poulose 8a7365c2d4 coresight: Expose device connections via sysfs
Coresight device connections are a bit complicated and is not
exposed currently to the user. One has to look at the platform
descriptions (DT bindings or ACPI bindings) to make an understanding.
Given the new naming scheme, it will be helpful to have this information
to choose the appropriate devices for tracing. This patch exposes
the device connections via links in the sysfs directories.

e.g, for a connection devA[OutputPort_X] -> devB[InputPort_Y]
is represented as two symlinks:

  /sys/bus/coresight/.../devA/out:X -> /sys/bus/coresight/.../devB
  /sys/bus/coresight/.../devB/in:Y  -> /sys/bus/coresight/.../devA

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Revised to use the generic sysfs links functions & link structures.
Provides a connections sysfs group in each device to hold the links.]
Co-developed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-5-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:15 +02:00
Mike Leach 8096152588 coresight: Add generic sysfs link creation functions
To allow the connections between coresight components to be represented
in sysfs, generic methods for creating sysfs links between two coresight
devices are added.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-4-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:15 +02:00
Suzuki K Poulose 68a5d5fccb coresight: Add return value for fixup connections
Handle failures in fixing up connections for a newly registered
device. This will be useful to handle cases where we fail to expose
the links via sysfs for the connections.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-3-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:15 +02:00
Suzuki K Poulose d60250a459 coresight: Pass coresight_device for coresight_release_platform_data
As we prepare to expose the links between the devices in
sysfs, pass the coresight_device instance to the
coresight_release_platform_data in order to free up the connections
when the device is removed.

No functional changes as such in this patch.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200518180242.7916-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-19 16:31:15 +02:00
Greg Kroah-Hartman 2bd7d8df3e This tag contains the following changes for kernel 5.8:
- GAUDI ASIC support. The tag contains code and header files needed to
   initialize the GAUDI ASIC and run workloads on it. There are changes to
   the common code that are needed for GAUDI and there is the addition of
   new ASIC-dependent code of GAUDI.
 
 - Add new feature of signal/wait command submissions. This is relevant to
   GAUDI only and allows the user to sync between streams (queues) inside
   the device.
 
 - Allow user to retrieve the device time alongside the host time, to allow
   a user application to synchronize device time together with host
   time during profiling.
 
 - Change ASIC's CPU initialization by loading its boot loader code from the
   Host memory (instead of it being programmed on the on-board FLASH).
 
 - Expose more attributes through HWMON.
 
 - Move the ASIC event handling code to be "common code". This will be
   shared between GAUDI and future ASICs. Goya will still use its own code.
 
 - Fix bug in command submission parsing in Goya.
 
 - Small fixes to security configuration (open up some registers for user
   access).
 
 - Improvements to ASIC reset code.
 -----BEGIN PGP SIGNATURE-----
 
 iQFKBAABCgA0FiEE7TEboABC71LctBLFZR1NuKta54AFAl7D2QYWHG9kZWQuZ2Fi
 YmF5QGdtYWlsLmNvbQAKCRBlHU24q1rngFEFB/9Dmnrqay6JiL9D2JmfNcijz+by
 To+p+O6FEFzJ8PDqffFQ7Z38w4/jgYqHNokfU594eqT73lU5Jqa57XeaWo3mrljy
 7VqSGti7Aa+xC0FW1gZiV6r/n12Cj+9v/p3aAn1RLYJsYgNzVMD+tIwdJW7sfKox
 o1jvYiZZ3l4eYW60wBMTLil/HUba9Nn2LFmYIeyIpRlXdssLYO7NBQYJ2PAEw8HB
 ZohLY6ixkdmAHa3YXaoS98KKva45bEAOWWJ1cbphhuXkqhoMRqzPyEofRBEyK1lF
 wnm2QZva8jYASY2RW3JAZx90d+NldtGQV72S4uJYnaB+lbXJaCWc0E89TwPK
 =r1m4
 -----END PGP SIGNATURE-----

Merge tag 'misc-habanalabs-next-2020-05-19' of git://people.freedesktop.org/~gabbayo/linux into char-misc-next

Oded writes:

This tag contains the following changes for kernel 5.8:

- GAUDI ASIC support. The tag contains code and header files needed to
  initialize the GAUDI ASIC and run workloads on it. There are changes to
  the common code that are needed for GAUDI and there is the addition of
  new ASIC-dependent code of GAUDI.

- Add new feature of signal/wait command submissions. This is relevant to
  GAUDI only and allows the user to sync between streams (queues) inside
  the device.

- Allow user to retrieve the device time alongside the host time, to allow
  a user application to synchronize device time together with host
  time during profiling.

- Change ASIC's CPU initialization by loading its boot loader code from the
  Host memory (instead of it being programmed on the on-board FLASH).

- Expose more attributes through HWMON.

- Move the ASIC event handling code to be "common code". This will be
  shared between GAUDI and future ASICs. Goya will still use its own code.

- Fix bug in command submission parsing in Goya.

- Small fixes to security configuration (open up some registers for user
  access).

- Improvements to ASIC reset code.

* tag 'misc-habanalabs-next-2020-05-19' of git://people.freedesktop.org/~gabbayo/linux: (38 commits)
  habanalabs: update patched_cb_size for Wreg32
  habanalabs: move event handling to common firmware file
  habanalabs: enable gaudi code in driver
  habanalabs: add gaudi profiler module
  habanalabs: add gaudi security module
  habanalabs: add hwmgr module for gaudi
  habanalabs: add gaudi asic-dependent code
  uapi: habanalabs: add gaudi defines
  habanalabs: add gaudi asic registers header files
  habanalabs: get card type, location from F/W
  habanalabs: support clock gating enable/disable
  habanalabs: set PM profile to auto only for goya
  habanalabs: add dedicated define for hard reset
  habanalabs: check if CoreSight is supported
  habanalabs: add signal/wait to CS IOCTL operations
  habanalabs: handle the h/w sync object
  habanalabs: define ASIC-dependent interface for signal/wait
  uapi: habanalabs: add signal/wait operations
  habanalabs: add missing MODULE_DEVICE_TABLE
  habanalabs: print all CB handles as hex numbers
  ...
2020-05-19 16:22:38 +02:00
Rachel Stahl 87eaea1cf8 habanalabs: update patched_cb_size for Wreg32
The patch_cb_size is not updated for Wreg32 in its validate function, so
updated in goya_validate_cb.

Signed-off-by: Rachel Stahl <rstahl@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Ofir Bitton ebd8d12251 habanalabs: move event handling to common firmware file
Instead of writing similar event handling code for each ASIC, move the code
to the common firmware file. This code will be used for GAUDI and all
future ASICs.

In addition, add two new fields to the auto-generated events file: valid
and description. This will save the need to manually write the events
description in the source code and simplify the code.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay af57cb81a6 habanalabs: enable gaudi code in driver
Enable the GAUDI ASIC code in the pci probe callback of the driver so the
driver will handle GAUDI ASICs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 79fc7a9fff habanalabs: add gaudi profiler module
Add the GAUDI code to initialize the ASIC's profiler. The profile receives
its initialization values from the user, same as in Goya, but the code to
initialize is in the driver because the configuration space of the
device is not directly exposed to the user.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 3a3a5bf196 habanalabs: add gaudi security module
Add the code to initialize the security module of GAUDI. Similar to Goya,
we have two dedicated mechanisms for security: Range Registers and
Protection bits. Those mechanisms protect sensitive memory and
configuration areas inside the device.

In addition, in Gaudi we moved to a 3-level security scheme, where the F/W
runs with the highest security level (Privileged), the driver runs with a
less secured level (Secured) and the user is neither privileged nor
secured. The security module in the driver configures the Secured parts so
the user won't be able to access them. The Privileged parts are configured
by the F/W.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay bcaf415204 habanalabs: add hwmgr module for gaudi
The hwmgr module is responsible for messages sent to GAUDI F/W that are
not common to all habanalabs ASICs.

In GAUDI, we provide the user a simplified mode of controlling the ASIC
clock frequency. Instead of three different clocks, we present a single
clock property that the user can configure via sysfs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay ac0ae6a96a habanalabs: add gaudi asic-dependent code
Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function
callbacks that the driver's common code need to initialize, finalize and
submit workloads to the GAUDI ASIC.

It also contains the code to initialize the F/W of the GAUDI ASIC and to
receive events from the F/W.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 466c7822b0 uapi: habanalabs: add gaudi defines
Add the new defines for GAUDI uapi interface. It includes the queue IDs,
the engine IDs, SRAM reserved space and Sync Manager reserved resources.

There is no new IOCTL or additional operations in existing IOCTLs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 2aad2bf81c habanalabs: add gaudi asic registers header files
Add the relevant GAUDI ASIC registers header files. These files are
generated automatically from a tool maintained by the VLSI engineers.

There are more files which are not upstreamed because only very few defines
from those files are used in the driver. For those files, we copied the
relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of
this patch.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman fca72fbb66 habanalabs: get card type, location from F/W
For Gaudi the driver gets two new additional properties from the F/W:
1. The card's type - PCI or PMC
2. The card's location in the Gaudi's box (relevant only for PMC).

The card's location is also passed to the user in the HW IP info structure
as it needs this property for establishing communication between Gaudis.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay ca62433f53 habanalabs: support clock gating enable/disable
In Gaudi there is a feature of clock gating certain engines.
Therefore, add this property to the device structure.

In addition, due to a limitation of this feature, the driver needs to
dynamically enable or disable this feature during run-time. Therefore, add
ASIC interface functions to enable/disable this function from the common
code.

Moreover, this feature must be turned off when the user wishes to debug the
ASIC by reading/writing registers and/or memory through the driver's
debugfs. Therefore, add an option to enable/disable clock gating via the
debugfs interface.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 803917f960 habanalabs: set PM profile to auto only for goya
For Gaudi, the driver doesn't change the PM profile automatically due to
device-controlled PM capabilities. Therefore, set the PM profile to auto
only for Goya so the driver's code to automatically change the profile
won't run on Gaudi.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman e09498b078 habanalabs: add dedicated define for hard reset
Gaudi requires longer waiting during reset due to closing of network ports.
Add this explanation to the relevant comment in the code and add a
dedicated define for this reset timeout period, instead of multiplying
another define.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman 9e5e49cd5b habanalabs: check if CoreSight is supported
Coresight is not supported on simulator, therefore add a boolean for
checking that (currently used by un-upstreamed code).

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman b75f22505a habanalabs: add signal/wait to CS IOCTL operations
Add the following two operations to the CS IOCTL:

Signal:

The signal operation is basically a command submission, that is created by
the driver upon user request. It will be implemented using a dedicated PQE
that will increment a specific SOB. There will be a new flag:
HL_CS_FLAGS_SIGNAL. When the user set this flag in the CS IOCTL structure,
the driver will execute a dedicated code path that will prepare this
special PQE and submit it. The user only needs to provide a queue index on
which to put the signal.

Wait:

The wait operation is also a command submission that is created by the
driver upon user request. It will be implemented using a dedicated PQE that
will contain packets of "ARM a monitor" + FENCE packet. There will be a new
flag: HL_CS_FLAGS_WAIT. When the user set this flag in the CS structure,
the driver will execute a dedicated code path that will prepare this
special PQE and submit it.

The user needs to provide the following parameters:
1. queue ID
2. an array of signal_seq numbers and the number of signals to wait on
   (the length of signal_seq_arr).

The IOCTL will return the CS sequence number of the wait it put on the
queue ID.

Currently, the code supports signal_seq_nr==1. But this API definition will
allow us to put a single PQE that waits on multiple signals.

To correctly configure the monitor and fence, the driver will need to
retrieve the specified signal CS object that contains the relevant SOB and
its expected value. In case the signal CS has already been completed, there
is no point of adding a wait operation. In this case, the driver will
return to the user *without* putting anything on the PQ. The return code
should reflect to the user that the signal was completed, as we won't
return a CS sequence number for this wait.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman b0b5d92579 habanalabs: handle the h/w sync object
Define a structure representing the h/w sync object (SOB).

a SOB can contain up to 2^15 values. Each signal CS will increment the SOB
by 1, so after some time we will reach the maximum number the SOB can
represent. When that happens, the driver needs to move to a different SOB
for the signal operation.

A SOB can be in 1 of 4 states:

1. Working state with value < 2^15

2. We reached a value of 2^15, but the signal operations weren't completed
yet OR there are pending waits on this signal. For the next submission, the
driver will move to another SOB.

3. ALL the signal operations on the SOB have finished AND there are no more
pending waits on the SOB AND we reached a value of 2^15 (This basically
means the refcnt of the SOB is 0 - see explanation below). When that
happens, the driver can clear the SOB by simply doing WREG32 0 to it and
set the refcnt back to 1.

4. The SOB is cleared and can be used next time by the driver when it needs
to reuse an SOB.

Per SOB, the driver will maintain a single refcnt, that will be initialized
to 1. When a signal or wait operation on this SOB is submitted to the PQ,
the refcnt will be incremented. When a signal or wait operation on this SOB
completes, the refcnt will be decremented. After the submission of the
signal operation that increments the SOB to a value of 2^15, the refcnt is
also decremented.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman ec2f8a306a habanalabs: define ASIC-dependent interface for signal/wait
This feature requires handling h/w resources which are a bit different from
one ASIC to the other. Therefore, we need to define a set of interfaces the
ASIC code provides to the common code to signal, wait, reset sync object
and to reset and init a queue.

As this feature is not supported in Goya, provide an empty implementation
of those functions.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Omer Shpigelman f9e5f29518 uapi: habanalabs: add signal/wait operations
This is a pre-requisite to upstreaming GAUDI support.

Signal/wait operations are done by the user to perform sync between two
Primary Queues (PQs). The sync is done using the sync manager and it is
usually resolved inside the device, but sometimes it can be resolved in the
host, i.e. the user should be able to wait in the host until a signal has
been completed.

The mechanism to define signal and wait operations is done by the driver
because it needs atomicity and serialization, which is already done in the
driver when submitting work to the different queues.

To implement this feature, the driver "takes" a couple of h/w resources,
and this is reflected by the defines added to the uapi file.

The signal/wait operations are done via the existing CS IOCTL, and they use
the same data structure. There is a difference in the meaning of some of
the parameters, and for that we added unions to make the code more
readable.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 824b457839 habanalabs: add missing MODULE_DEVICE_TABLE
PCI drivers should use this define to declare their PCI ID table.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Dotan Barak 0a62c3926e habanalabs: print all CB handles as hex numbers
Make all the CB handles printed in the same way and not some as decimal and
some as hex numbers.

Signed-off-by: Dotan Barak <dbarak@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 010a118cfe habanalabs: update F/W register map
Update the mapping to the latest one used by the Firmware. No impact on the
driver in this update.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Adam Aharon aa9dd58bcc habanalabs: enable trace data compression (profiler)
Set the STMTCSR.COMPEN bit to enable leading-zero trace data
compression functionality for the extended stimulus ports.

Signed-off-by: Adam Aharon <aaharon@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Ofir Bitton 47f6b41cdd habanalabs: load CPU device boot loader from host
Load CPU device boot loader during driver boot time in order to avoid flash
write for every boot loader update.

To preserve backward-compatibility, skip the device boot load if the device
doesn't request it.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 39b425170d habanalabs: leave space for 2xMSG_PROT in CB
The user must leave space for 2xMSG_PROT in the external CB, so adjust the
define of max size accordingly. The driver, however, can still create a CB
with the maximum size of 2MB. Therefore, we need to add a check
specifically for the user requested size.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Christine Gharzuzi 8e708af284 habanalabs: support hwmon_reset_history attribute
Support hwmon_temp_reset_histroy, hwmon_in_reset_history and
hwmon_curr_reset attribute which resets the historical highest value.

Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Tomer Tayar 79c823c57e habanalabs: Align protection bits configuration of all TPCs
Align the protection bits configuration of all TPC cores to be as of TPC
core 0.

Fixes: a513f9a7ec ("habanalabs: make tpc registers secured")

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Tomer Tayar eef544f746 habanalabs: Allow access to TPC LFSR register
Allow user access to TPC LFSR register, as it might be accessed by TPC
kernels.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Tomer Tayar 25e7aeba60 habanalabs: Add INFO IOCTL opcode for time sync information
Add a new opcode to the INFO IOCTL that retrieves the device time
alongside the host time, to allow a user application that want to measure
device time together with host time (such as a profiler) to synchronize
these times.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
kbuild test robot ba7193c952 habanalabs: hl_pci_set_dma_mask() can be static
set function to be static as it is not called from outside its file.

Signed-off-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-19 14:48:41 +03:00
Oded Gabbay 926ba4cce1 habanalabs: handle barriers in DMA QMAN streams
When we have DMA QMAN with multiple streams, we need to know whether the
command buffer contains at least one DMA packet in order to configure the
barriers correctly when adding the 2xMSG_PROT at the end of the JOB. If
there is no DMA packet, then there is no need to put engine barrier. This
is relevant only for GAUDI as GOYA doesn't have streams so the engine can't
be busy by another stream.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Oded Gabbay cb056b9fd5 habanalabs: retrieve DMA mask indication from firmware
Retrieve from the firmware the DMA mask value we need to set according to
the device's PCI controller configuration. This is needed when working on
POWER9 machines, as the device's PCI controller is configured in a
different way in those machines.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Oded Gabbay c8aee597bb habanalabs: update firmware definitions
Add comments for the various errors and states of the firmware during boot.
Add a mapping of a new register that will tell the driver whether the
firmware executed the request from the driver or if it has encountered an
error.
Add a new enum for the possible values of this register.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Oded Gabbay 7a65ee046b habanalabs: increase timeout during reset
When doing training, the DL framework (e.g. tensorflow) performs hundreds
of thousands of memory allocations and mappings. In case the driver needs
to perform hard-reset during training, the driver kills the application and
unmaps all those memory allocations. Unfortunately, because of that large
amount of mappings, the driver isn't able to do that in the current timeout
(5 seconds). Therefore, increase the timeout significantly to 30 seconds
to avoid situation where the driver resets the device with active mappings,
which sometime can cause a kernel bug.

BTW, it doesn't mean we will spend all the 30 seconds because the reset
thread checks every one second if the unmap operation is done.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Oded Gabbay 49aba0bbab habanalabs: print warning when reset is requested
When the system administrator asks the driver to soft or hard reset the
device through sysfs, the driver should display a warning in the kernel log
to explain why it suddenly resets the device.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Oded Gabbay 7e1c07dd35 habanalabs: unify and improve device cpu init
Move the code of device CPU initialization from being ASIC-Dependent to
common code. In addition, add support for the new error reporting feature
of the firmware boot code.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Omer Shpigelman 1fa185c656 habanalabs: re-factor H/W queues initialization
We want to remove the following restrictions/assumptions in our driver:
1. The H/W queue index is also the completion queue index.
2. The H/W queue index is also the IRQ number of the completion queue.
3. All queues of the same type have consecutive indexes.

Therefore we add the support for H/W queues of the same type with
nonconsecutive indexes and completion queue index and IRQ number different
than the H/W queue index.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00
Omer Shpigelman 76cedc739d habanalabs: remove stop-on-error flag from DMA
Stop-on-error mode in DMA is useful as it stops the transaction
immediately upon error e.g. page fault.
But it may cause the next command submission to fail as is leaves the DMA
in unstable state.
Therefore we remove the stop-on-error configuration from the DMA.
Stop-on-err is still available for debug.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-05-17 12:06:22 +03:00