1
0
Fork 0
Commit Graph

1550 Commits (5260c2b863ef1152445ce93476c95d8c8a727eef)

Author SHA1 Message Date
Linus Torvalds 22331f8952 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu-feature updates from Ingo Molnar:

 - Rework the Intel model names symbols/macros, which were decades of
   ad-hoc extensions and added random noise. It's now a coherent, easy
   to follow nomenclature.

 - Add new Intel CPU model IDs:
    - "Tiger Lake" desktop and mobile models
    - "Elkhart Lake" model ID
    - and the "Lightning Mountain" variant of Airmont, plus support code

 - Add the new AVX512_VP2INTERSECT instruction to cpufeatures

 - Remove Intel MPX user-visible APIs and the self-tests, because the
   toolchain (gcc) is not supporting it going forward. This is the
   first, lowest-risk phase of MPX removal.

 - Remove X86_FEATURE_MFENCE_RDTSC

 - Various smaller cleanups and fixes

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  x86/cpu: Update init data for new Airmont CPU model
  x86/cpu: Add new Airmont variant to Intel family
  x86/cpu: Add Elkhart Lake to Intel family
  x86/cpu: Add Tiger Lake to Intel family
  x86: Correct misc typos
  x86/intel: Add common OPTDIFFs
  x86/intel: Aggregate microserver naming
  x86/intel: Aggregate big core graphics naming
  x86/intel: Aggregate big core mobile naming
  x86/intel: Aggregate big core client naming
  x86/cpufeature: Explain the macro duplication
  x86/ftrace: Remove mcount() declaration
  x86/PCI: Remove superfluous returns from void functions
  x86/msr-index: Move AMD MSRs where they belong
  x86/cpu: Use constant definitions for CPU models
  lib: Remove redundant ftrace flag removal
  x86/crash: Remove unnecessary comparison
  x86/bitops: Use __builtin_constant_p() directly instead of IS_IMMEDIATE()
  x86: Remove X86_FEATURE_MFENCE_RDTSC
  x86/mpx: Remove MPX APIs
  ...
2019-09-16 18:47:53 -07:00
Isaac Vaughn 3e443eb353 EDAC/amd64: Add PCI device IDs for family 17h, model 70h
Add the new Family 17h Model 70h PCI IDs (device 18h functions 0 and 6)
to the AMD64 EDAC module.

 [ bp: s/f17_base_addr_to_cs_size/f17_addr_mask_to_cs_size/g ]

Signed-off-by: Isaac Vaughn <isaac.vaughn@knights.ucf.edu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac@vger.kernel.org
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190906192131.8ced0ca112146f32d82b6cae@knights.ucf.edu
2019-09-07 07:29:27 +02:00
Robert Richter e701f41203 EDAC/mc_sysfs: Make debug messages consistent
Debug messages are inconsistently used in the error handlers. Some lack
an error message, some are called regardless of the return status,
messages for the same error are at different locations in the code
depending on the error code. This happens esp. near put_device() calls.

Make those debug messages more consistent. Additionally, unify the error
messages to have the same terms for the same operations of the device.

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190902123216.9809-5-rrichter@marvell.com
2019-09-04 11:39:19 +02:00
Robert Richter 644110e17d EDAC/mc_sysfs: Remove pointless gotos
Use direct returns instead of gotos. Error handling code becomes
smaller and better readable.

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190902123216.9809-4-rrichter@marvell.com
2019-09-03 19:24:10 +02:00
Robert Richter d55c79ac86 EDAC: Prefer 'unsigned int' to bare use of 'unsigned'
Use of 'unsigned int' instead of bare use of 'unsigned'. Fix this for
edac_mc*, ghes and the i5100 driver as reported by checkpatch.pl.

While at it, struct member dev_ch_attribute->channel is always used as
unsigned int. Change type to unsigned int to avoid type casts.

 [ bp: Massage. ]

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190902123216.9809-2-rrichter@marvell.com
2019-09-03 19:21:19 +02:00
Peter Zijlstra 5ebb34edbe x86/intel: Aggregate microserver naming
Currently big microservers have _XEON_D while small microservers have
_X, Make it uniformly: _D.

for i in `git grep -l "\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*_\(X\|XEON_D\)"`
do
	sed -i -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*ATOM.*\)_X/\1_D/g' \
	       -e 's/\(\(INTEL_FAM6_\|VULNWL_INTEL\|INTEL_CPU_FAM6\).*\)_XEON_D/\1_D/g' ${i}
done

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190827195122.677152989@infradead.org
2019-08-28 11:29:32 +02:00
Yazen Ghannam 81f5090db8 EDAC/amd64: Support asymmetric dual-rank DIMMs
Future AMD systems will support asymmetric dual-rank DIMMs. These are
DIMMs where the ranks are of different sizes.

The even rank will use the Primary Even Chip Select registers and the
odd rank will use the Secondary Odd Chip Select registers.

Recognize if a Secondary Odd Chip Select is being used. Use the
Secondary Odd Address Mask when calculating the chip select size.

 [ bp: move csrow_sec_enabled() to the header, fix CS_ODD define and
   tone-down the capitalized words spelling. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-8-Yazen.Ghannam@amd.com
2019-08-23 16:09:52 +02:00
Yazen Ghannam 7574729e91 EDAC/amd64: Cache secondary Chip Select registers
AMD Family 17h systems have a set of secondary Chip Select Base
Addresses and Address Masks. These do not represent unique Chip
Selects, rather they are used in conjunction with the primary
Chip Select registers in certain cases.

Cache these secondary Chip Select registers for future use.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-7-Yazen.Ghannam@amd.com
2019-08-23 12:55:05 +02:00
Yazen Ghannam 8a2eaab7da EDAC/amd64: Decode syndrome before translating address
AMD Family 17h systems currently require address translation in order to
report the system address of a DRAM ECC error. This is currently done
before decoding the syndrome information. The syndrome information does
not depend on the address translation, so the proper EDAC csrow/channel
reporting can function without the address. However, the syndrome
information will not be decoded if the address translation fails.

Decode the syndrome information before doing the address translation.
The syndrome information is architecturally defined in MCA_SYND and can
be considered robust. The address translation is system-specific and may
fail on newer systems without proper updates to the translation
algorithm.

Fixes: 713ad54675 ("EDAC, amd64: Define and register UMC error decode function")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-6-Yazen.Ghannam@amd.com
2019-08-23 07:50:21 +02:00
Yazen Ghannam e53a3b267f EDAC/amd64: Find Chip Select memory size using Address Mask
Chip Select memory size reporting on AMD Family 17h was recently fixed
in order to account for interleaving. However, the current method is not
robust.

The Chip Select Address Mask can be used to find the memory size. There
are a couple of cases.

1) For single-rank and dual-rank non-interleaved, use the address mask
plus 1 as the size.

2) For dual-rank interleaved, do #1 but "de-interleave" the address mask
first.

Always "de-interleave" the address mask in order to simplify the code
flow. Bit mask manipulation is necessary to check for interleaving, so
just go ahead and do the de-interleaving. In the non-interleaved case,
the original and de-interleaved address masks will be the same.

To de-interleave the mask, count the number of zero bits in the middle
of the mask and swap them with the most significant bits.

For example,
Original=0xFFFF9FE, De-interleaved=0x3FFFFFE

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-5-Yazen.Ghannam@amd.com
2019-08-23 07:23:53 +02:00
Yazen Ghannam 353a1fcb8f EDAC/amd64: Initialize DIMM info for systems with more than two channels
Currently, the DIMM info for AMD Family 17h systems is initialized in
init_csrows(). This function is shared with legacy systems, and it has a
limit of two channel support.

This prevents initialization of the DIMM info for a number of ranks, so
there will be missing ranks in the EDAC sysfs.

Create a new init_csrows_df() for Family17h+ and revert init_csrows()
back to pre-Family17h support.

Loop over all channels in the new function in order to support systems
with more than two channels.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-4-Yazen.Ghannam@amd.com
2019-08-23 07:06:29 +02:00
Yazen Ghannam f8be8e5680 EDAC/amd64: Recognize DRAM device type ECC capability
AMD Family 17h systems support x4 and x16 DRAM devices. However, the
device type is not checked when setting mci.edac_ctl_cap.

Set the appropriate capability flag based on the device type.

Default to x8 DRAM device when neither the x4 or x16 bits are set.

 [ bp: reverse cpk_en check to save an indentation level. ]

Fixes: 2d09d8f301 ("EDAC, amd64: Determine EDAC MC capabilities on Fam17h")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-3-Yazen.Ghannam@amd.com
2019-08-23 06:58:31 +02:00
Yazen Ghannam d971e28e2c EDAC/amd64: Support more than two controllers for chip selects handling
The struct chip_select array that's used for saving chip select bases
and masks is fixed at length of two. There should be one struct
chip_select for each controller, so this array should be increased to
support systems that may have more than two controllers.

Increase the size of the struct chip_select array to eight, which is the
largest number of controllers per die currently supported on AMD
systems.

Fix number of DIMMs and Chip Select bases/masks on Family17h, because
AMD Family 17h systems support 2 DIMMs, 4 CS bases, and 2 CS masks per
channel.

Also, carve out the Family 17h+ reading of the bases/masks into a
separate function. This effectively reverts the original bases/masks
reading code to before Family 17h support was added.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-2-Yazen.Ghannam@amd.com
2019-08-22 19:08:49 +02:00
Robert Richter 718d58514e EDAC/mc: Cleanup _edac_mc_free() code
Remove needless and boilerplate variable declarations. No functional
changes.

 [ bp: Add newlines for better readability. ]

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190624150758.6695-10-rrichter@marvell.com
2019-08-14 18:27:00 +02:00
Stephen Douthit 29a3388bfc EDAC, pnd2: Fix ioremap() size in dnv_rd_reg()
Depending on how BIOS has marked the reserved region containing the 32KB
MCHBAR you can get warnings like:

resource sanity check: requesting [mem 0xfed10000-0xfed1ffff], which spans more than reserved [mem 0xfed10000-0xfed17fff]
caller dnv_rd_reg+0xc8/0x240 [pnd2_edac] mapping multiple BARs

Not all of the mmio regions used in dnv_rd_reg() are the same size.  The
MCHBAR window is 32KB and the sideband ports are 64KB.  Pass the correct
size to ioremap() depending on which resource we're reading from.

Signed-off-by: Stephen Douthit <stephend@silicom-usa.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-08-09 10:19:07 -07:00
Shravan Kumar Ramani 82413e562e EDAC, mellanox: Add ECC support for BlueField DDR4
Add ECC support for Mellanox BlueField SoC DDR controller.
This requires SMC to the running Arm Trusted Firmware to report
what is the current memory configuration.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Shravan Kumar Ramani <sramani@mellanox.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-08-08 12:57:01 -03:00
Dan Carpenter 8faa1cf6ed EDAC/altera: Use the proper type for the IRQ status bits
Smatch complains about the cast of a u32 pointer to unsigned long:

  drivers/edac/altera_edac.c:1878 altr_edac_a10_irq_handler()
  warn: passing casted pointer '&irq_status' to 'find_first_bit()'

This code wouldn't work on a 64 bit big endian system because it would
read past the end of &irq_status.

 [ bp: massage. ]

Fixes: 13ab8448d2 ("EDAC, altera: Add ECC Manager IRQ controller support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: James Morse <james.morse@arm.com>
Cc: kernel-janitors@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190624134717.GA1754@mwanda
2019-08-07 10:37:34 +02:00
Robert Richter 3724ace582 EDAC/mc: Fix grain_bits calculation
The grain in EDAC is defined as "minimum granularity for an error
report, in bytes". The following calculation of the grain_bits in
edac_mc is wrong:

	grain_bits = fls_long(e->grain) + 1;

Where grain_bits is defined as:

	grain = 1 << grain_bits

Example:

	grain = 8	# 64 bit (8 bytes)
	grain_bits = fls_long(8) + 1
	grain_bits = 4 + 1 = 5

	grain = 1 << grain_bits
	grain = 1 << 5 = 32

Replace it with the correct calculation:

	grain_bits = fls_long(e->grain - 1);

The example gives now:

	grain_bits = fls_long(8 - 1)
	grain_bits = fls_long(7)
	grain_bits = 3

	grain = 1 << 3 = 8

Also, check if the hardware reports a reasonable grain != 0 and fallback
with a warning to 1 byte granularity otherwise.

 [ bp: massage a bit. ]

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190624150758.6695-2-rrichter@marvell.com
2019-08-03 12:05:51 +02:00
Thor Thayer 3123c5c4ca edac: altera: Move Stratix10 SDRAM ECC to peripheral
ARM32 SoCFPGAs had separate IRQs for SDRAM. ARM64 SoCFPGAs
send all DBEs to SError so filtering by source is necessary.

The Stratix10 SDRAM ECC is a better match with the generic
Altera peripheral ECC framework because the linked list can
be searched to find the ECC block offset and printout
the DBE Address.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-25 14:28:42 -04:00
Eiichi Tsukata d8655e7630 EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec
Commit 9da21b1509 ("EDAC: Poll timeout cannot be zero, p2") assumes
edac_mc_poll_msec to be unsigned long, but the type of the variable still
remained as int. Setting edac_mc_poll_msec can trigger out-of-bounds
write.

Reproducer:

  # echo 1001 > /sys/module/edac_core/parameters/edac_mc_poll_msec

KASAN report:

  BUG: KASAN: global-out-of-bounds in edac_set_poll_msec+0x140/0x150
  Write of size 8 at addr ffffffffb91b2d00 by task bash/1996

  CPU: 1 PID: 1996 Comm: bash Not tainted 5.2.0-rc6+ #23
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
  Call Trace:
   dump_stack+0xca/0x13e
   print_address_description.cold+0x5/0x246
   __kasan_report.cold+0x75/0x9a
   ? edac_set_poll_msec+0x140/0x150
   kasan_report+0xe/0x20
   edac_set_poll_msec+0x140/0x150
   ? dimmdev_location_show+0x30/0x30
   ? vfs_lock_file+0xe0/0xe0
   ? _raw_spin_lock+0x87/0xe0
   param_attr_store+0x1b5/0x310
   ? param_array_set+0x4f0/0x4f0
   module_attr_store+0x58/0x80
   ? module_attr_show+0x80/0x80
   sysfs_kf_write+0x13d/0x1a0
   kernfs_fop_write+0x2bc/0x460
   ? sysfs_kf_bin_read+0x270/0x270
   ? kernfs_notify+0x1f0/0x1f0
   __vfs_write+0x81/0x100
   vfs_write+0x1e1/0x560
   ksys_write+0x126/0x250
   ? __ia32_sys_read+0xb0/0xb0
   ? do_syscall_64+0x1f/0x390
   do_syscall_64+0xc1/0x390
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fa7caa5e970
  Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 04
  RSP: 002b:00007fff6acfdfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fa7caa5e970
  RDX: 0000000000000005 RSI: 0000000000e95c08 RDI: 0000000000000001
  RBP: 0000000000e95c08 R08: 00007fa7cad1e760 R09: 00007fa7cb36a700
  R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000005
  R13: 0000000000000001 R14: 00007fa7cad1d600 R15: 0000000000000005

  The buggy address belongs to the variable:
   edac_mc_poll_msec+0x0/0x40

  Memory state around the buggy address:
   ffffffffb91b2c00: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
   ffffffffb91b2c80: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa
  >ffffffffb91b2d00: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
                     ^
   ffffffffb91b2d80: 04 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
   ffffffffb91b2e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix it by changing the type of edac_mc_poll_msec to unsigned int.
The reason why this patch adopts unsigned int rather than unsigned long
is msecs_to_jiffies() assumes arg to be unsigned int. We can avoid
integer conversion bugs and unsigned int will be large enough for
edac_mc_poll_msec.

Reviewed-by: James Morse <james.morse@arm.com>
Fixes: 9da21b1509 ("EDAC: Poll timeout cannot be zero, p2")
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-06-27 10:24:47 -07:00
Qiuxu Zhuo 1dc78f1ffa EDAC, skx, i10nm: Fix source ID register offset
The source ID register offset for Skylake server is 0xf0, while for
Icelake server is 0xf8. Pass the correct offset to get the source ID.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-06-26 10:07:27 -07:00
Qiuxu Zhuo c4a1dd9e83 EDAC, i10nm: Check ECC enabling status per channel
The i10nm_edac only checks the ECC enabling status for the first
channel of the memory controller. If there aren't memory DIMMs
populated on the first channel, but at least one DIMM populated
on the second channel, it will wrongly report that the ECC for
the memory controller is disabled that fails to load the i10nm_edac
driver. Fix it by checking ECC enabling status per channel.

[Tony: Also report which channel has ECC disabled]

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-06-26 10:06:09 -07:00
Qiuxu Zhuo 5c5d3ac206 EDAC, i10nm: Add Intel additional Ice-Lake support
Two new CPU models share the same memory controller
architecture with Jacobsville/Tremont, so can use the
same i10nm EDAC driver.

Add ICX and ICX-D CPU model numbers for EDAC support.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-06-20 11:44:36 -07:00
Greg Kroah-Hartman 40d7962434 EDAC: Make edac_debugfs_create_x*() return void
The return values of edac_debugfs_create_x16() and
edac_debugfs_create_x8() are never checked (as they don't need to be),
so no need to have them return anything, just make the functions return
void instead.

This is done with the goal of being able to change the debugfs_create_x*
functions to also not return a value.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lkml.kernel.org/r/20190611175433.GA5108@kroah.com
2019-06-20 11:44:36 -07:00
YueHaibing ff70cacc96 EDAC/aspeed: Remove set but not used variable 'np'
Fix the following -Wunused-but-set-variable warning:

  drivers/edac/aspeed_edac.c: In function aspeed_probe:
  drivers/edac/aspeed_edac.c:284:22: warning: variable np set but not used [-Wunused-but-set-variable]

It is never used and can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Stefan Schaeckeler <sschaeck@cisco.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-aspeed@lists.ozlabs.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lkml.kernel.org/r/20190525144153.2028-1-yuehaibing@huawei.com
2019-06-20 11:44:36 -07:00
Marco Elver 4d91fde8d5 EDAC/ie31200: Reformat PCI device table
Reformat device table after Coffee Lake additions to be more readable.

No functional change.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lkml.kernel.org/r/20190610191422.177931-2-elver@google.com
2019-06-20 11:44:36 -07:00
Marco Elver c452a9d30f EDAC/ie31200: Add Intel Coffee Lake CPU support
Coffee Lake seems to work like Skylake and Kaby Lake. Add all device IDs
for Coffee Lake-S CPUs according to datasheet.

 [ bp: Massage. ]

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lkml.kernel.org/r/20190610191422.177931-1-elver@google.com
2019-06-20 11:44:36 -07:00
Yash Shah 91abaeaaff EDAC/sifive: Add EDAC platform driver for SiFive SoCs
Add an EDAC driver for SiFive SoCs. The initial version supports ECC
event monitoring and reporting through the EDAC framework for the SiFive
L2 cache controller. It registers for notifier events from the L2 cache
controller driver (arch/riscv/mm/sifive_l2_cache.c) for L2 ECC events.

 [ bp: Massage commit message. ]

Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-riscv@lists.infradead.org
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: sachin.ghadi@sifive.com
Link: https://lkml.kernel.org/r/1557142026-15949-2-git-send-email-yash.shah@sifive.com
2019-06-20 11:44:36 -07:00
Colin Ian King 0042e9e7a5 EDAC/sb_edac: Remove redundant update of tad_base
The variable tad_base is being set to a value that is never read and is
being over-written on the next iteration of a for-loop. This assignment
is therefore redundant and can be removed.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: James Morse <james.morse@arm.com>
Cc: kernel-janitors@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lkml.kernel.org/r/20190508224201.27120-1-colin.king@canonical.com
2019-06-20 11:44:36 -07:00
Thor Thayer a428b4d348 EDAC/altera: Add Stratix10 SDMMC support
Add SDMMC support for Stratix10 which has IRQ differences from Arria10.
Update comment accordingly.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: devicetree@vger.kernel.org
Cc: dinguyen@kernel.org
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: robh+dt@kernel.org
Link: https://lkml.kernel.org/r/1556030197-24534-4-git-send-email-thor.thayer@linux.intel.com
2019-06-20 11:44:36 -07:00
Thor Thayer 17e47dc6db EDAC/altera: Add Stratix10 OCRAM ECC support
Use the newer ECC error injection method for Arria10 and Stratix10
OCRAM. If OCRAM has already been initialized during boot and OCRAM ECC
is enabled, ensure the Single Bit Error IRQ is enabled.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: devicetree@vger.kernel.org
Cc: dinguyen@kernel.org
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: robh+dt@kernel.org
Link: https://lkml.kernel.org/r/1556030197-24534-2-git-send-email-thor.thayer@linux.intel.com
2019-06-20 11:44:36 -07:00
Greg KH 7adc05d2dc EDAC/sysfs: Drop device references properly
Do put_device() if device_add() fails.

 [ bp: do device_del() for the successfully created devices in
   edac_create_csrow_objects(), on the unwind path. ]

Signed-off-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20190427214925.GE16338@kroah.com
2019-06-20 11:44:36 -07:00
Pan Bian 585fb3d93d EDAC/sysfs: Fix memory leak when creating a csrow object
In edac_create_csrow_object(), the reference to the object is not
released when adding the device to the device hierarchy fails
(device_add()). This may result in a memory leak.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/1555554438-103953-1-git-send-email-bianpan2016@163.com
2019-06-20 11:44:36 -07:00
Thomas Gleixner b886d83c5b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:17 +02:00
Thomas Gleixner 4505153954 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 02111
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 136 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:06 +02:00
Thomas Gleixner 2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner 9952f6918d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 228 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528171438.107155473@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:52 -07:00
Thomas Gleixner 122375508b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 172
Based on 1 normalized pattern(s):

  this file may be distributed under the terms of the gnu general
  public license version 2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 9 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.395589349@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:39 -07:00
Thomas Gleixner 1ccea77e2a treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not see http www gnu org licenses

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details [based]
  [from] [clk] [highbank] [c] you should have received a copy of the
  gnu general public license along with this program if not see http
  www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 355 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 11:28:45 +02:00
Thomas Gleixner 09c434b8a0 treewide: Add SPDX license identifier for more missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have MODULE_LICENCE("GPL*") inside which was used in the initial
   scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Thomas Gleixner 457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Linus Torvalds 11b1177399 * Do not build mpc85_edac as a module (Michael Ellerman)
* Correct edac_mc_find()'s return value on error	(Robert Richter)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAlzdOjIACgkQEsHwGGHe
 VUrkxQ/+J+mSL23ush4HNq43OqW7Wgpkdoljo+5hr1XWH790Kc/HNI+B/Qdsgl6I
 UoFkJPwMTfxqV/JZFuNC22ADnrXw6nRDZSE7+K+fHmpBgj2rK4hQ6tOYON9dTg/X
 9nPF1ylBQcdzwNt4eTTnS5U1eKBSWqmQm5NtYmq2s83DQ1v3r2Pckqx00SC5Stdy
 9vQLCPUR5SjsGpeQZnrdesxO89t3pWWzfZrAsCxPiY36VbUVmsHQVbHC9yfpLSIE
 6pzc0esQx6zqyb4e0Rx+aKJPJ6fS43NsxifkI81h7S5F/xPEuZy39rhAmnEOYCNE
 0ceQwl+ktzzGzmurgXf9Y5LLy8vOcCUJX4HBCkaweHSDRiP1xZtIl7Ao3aAaP4no
 AUQqX/O0BydkO+awarhCduM7vIGdr0B9xn0xCtUODitoTIzaoVZ8F2iF7xSOi/3F
 QkP4dSsf72ZK/EdA/8k1bxyBnxQ/TKgwLIQEJ1n9Bz7biPxZmqCyHQMDg7HaA1Uc
 crGkJJr0cGHb9TEAzy0ygtRNrKE+hpt0FGFGE0zlE4uX7YP1Y/Bw3R38zbcl2x9h
 nIqEmHwswVfBdm28RfxRH3LnS9SHwsXTtUk0y89Je1pK0qZ2Kv6nRiEFzWIAqPyY
 I/7eorSkNXU+vVnZko8cvuHEuJh1VUj2bGJeS6NUo9X5+IOE0LE=
 =yjXI
 -----END PGP SIGNATURE-----

Merge tag 'edac_fixes_for_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fixes from Borislav Petkov:

 - Do not build mpc85_edac as a module (Michael Ellerman)

 - Correct edac_mc_find()'s return value on error (Robert Richter)

* tag 'edac_fixes_for_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC/mc: Fix edac_mc_find() in case no device is found
  EDAC/mpc85xx: Prevent building as a module
2019-05-16 11:55:35 -07:00
Robert Richter 29a0c84397 EDAC/mc: Fix edac_mc_find() in case no device is found
The function should return NULL in case no device is found, but it
always returns the last checked mc device from the list even if the
index did not match. Fix that.

I did some analysis why this did not raise any issues for about 3 years
and the reason is that edac_mc_find() is mostly used to search for
existing devices. Thus, the bug is not triggered.

 [ bp: Drop the if (mci->mc_idx > idx) test in favor of readability. ]

Fixes: c73e8833be ("EDAC, mc: Fix locking around mc_devices list")
Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Link: https://lkml.kernel.org/r/20190514104838.15065-1-rrichter@marvell.com
2019-05-14 17:08:46 +02:00
Michael Ellerman 2b8358a951 EDAC/mpc85xx: Prevent building as a module
The mpc85xx EDAC driver can be configured as a module but then fails to
build because it uses two unexported symbols:

  ERROR: ".pci_find_hose_for_OF_device" [drivers/edac/mpc85xx_edac_mod.ko] undefined!
  ERROR: ".early_find_capability" [drivers/edac/mpc85xx_edac_mod.ko] undefined!

We don't want to export those symbols just for this driver, so make the
driver only configurable as a built-in.

This seems to have been broken since at least

  c92132f598 ("edac/85xx: Add PCIe error interrupt edac support")

(Nov 2013).

 [ bp: make it depend on EDAC=y so that the EDAC core doesn't get built
   as a module. ]

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linuxppc-dev@ozlabs.org
Cc: morbidrsa@gmail.com
Link: https://lkml.kernel.org/r/20190502141941.12927-1-mpe@ellerman.id.au
2019-05-10 20:15:11 +02:00
Linus Torvalds ffa6f55eb6 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:

 - Support for varying MCA bank numbers per CPU: this is in preparation
   for future CPU enablement (Yazen Ghannam)

 - MCA banks read race fix (Tony Luck)

 - Facility to filter MCEs which should not be logged (Yazen Ghannam)

 - The usual round of cleanups and fixes

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models
  x86/MCE: Add an MCE-record filtering function
  RAS/CEC: Increment cec_entered under the mutex lock
  x86/mce: Fix debugfs_simple_attr.cocci warnings
  x86/mce: Remove mce_report_event()
  x86/mce: Handle varying MCA bank counts
  x86/mce: Fix machine_check_poll() tests for error types
  MAINTAINERS: Fix file pattern for X86 MCE INFRASTRUCTURE
  x86/MCE: Group AMD function prototypes in <asm/mce.h>
2019-05-06 19:54:57 -07:00
Borislav Petkov 8de9930a46 Revert "EDAC/amd64: Support more than two controllers for chip select handling"
This reverts commit 0a227af521.

Unfortunately, this commit caused wrong detection of chip select sizes
on some F17h client machines:

  --- 00-rc6+     2019-02-14 14:28:03.126622904 +0100
  +++ 01-rc4+     2019-04-14 21:06:16.060614790 +0200
   EDAC amd64: MC: 0:     0MB 1:     0MB
  -EDAC amd64: MC: 2: 16383MB 3: 16383MB
  +EDAC amd64: MC: 2:     0MB 3: 2097151MB
   EDAC amd64: MC: 4:     0MB 5:     0MB
   EDAC amd64: MC: 6:     0MB 7:     0MB
   EDAC MC: UMC1 chip selects:
   EDAC amd64: MC: 0:     0MB 1:     0MB
  -EDAC amd64: MC: 2: 16383MB 3: 16383MB
  +EDAC amd64: MC: 2:     0MB 3: 2097151MB
   EDAC amd64: MC: 4:     0MB 5:     0MB
   EDAC amd64: MC: 6:     0MB 7:     0M

Revert it for now until it has been solved properly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
2019-04-25 16:30:34 +02:00
Yazen Ghannam 71a84402b9 x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models
AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA
errors under certain conditions. The errors are benign and can safely be
ignored. However, the high error rate may cause the MCA threshold
counter to overflow causing a high rate of thresholding interrupts.

In addition, users may see the errors reported through the AMD MCE
decoder module, even with the interrupt disabled, due to MCA polling.

Clear the "Counter Present" bit in the Instruction Fetch bank's
MCA_MISC0 register. This will prevent enabling MCA thresholding on this
bank which will prevent the high interrupt rate due to this error.

Define an AMD-specific function to filter these errors from the MCE
event pool so that they don't get reported during early boot.

Rename filter function in EDAC/mce_amd to avoid a naming conflict, while
at it.

 [ bp: Move function prototype to the internal header and
   massage/cleanup, fix typos. ]

Reported-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "clemej@gmail.com" <clemej@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Shirish S <Shirish.S@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org> # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
Cc: <stable@vger.kernel.org> # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
Cc: <stable@vger.kernel.org> # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in <asm/mce.h>
Cc: <stable@vger.kernel.org> # 5.0.x
Link: https://lkml.kernel.org/r/20190325163410.171021-2-Yazen.Ghannam@amd.com
2019-04-23 18:16:07 +02:00
Thor Thayer fad9fab975 EDAC/altera, firmware/intel: Add Stratix10 ECC DBE SMC call
Reserve ECC Double Bit Error SMC call to alert U-Boot that a DBE has
occurred. Move the call from local EDAC header file to a common header.

 [ bp: Merge the two patches. ]

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Richard Gong <richard.gong@intel.com>
Reviewed-by: Alan Tull <atull@kernel.org> # firmware
Cc: Greg KH <greg@kroah.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mchehab@kernel.org
Link: https://lkml.kernel.org/r/1553870639-23895-1-git-send-email-thor.thayer@linux.intel.com

Signed-off-by: Borislav Petkov <bp@suse.de>
2019-04-02 17:42:15 +02:00
Thor Thayer 788586efd1 EDAC/altera: Initialize peripheral FIFOs in probe()
The FIFO memory and ECC initialization doesn't need to be
done as a separate operation early in the startup.

Improve the Arria10 and Stratix10 peripheral FIFO init
by initializing memory and enabling ECC as part of the
device driver initialization.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/1553635771-32693-2-git-send-email-thor.thayer@linux.intel.com
2019-03-29 11:46:23 +01:00
Thor Thayer 436b0a583a EDAC/altera: Do less intrusive error injection
Improve the Arria10 and Stratix10 error injection routine
by reading the data and changing just 1 bit before writing
back out. Previous routine would overwrite the first bytes
to 0 then change 1 bit but this method is less intrusive.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: https://lkml.kernel.org/r/1553635771-32693-1-git-send-email-thor.thayer@linux.intel.com
2019-03-29 11:35:39 +01:00