1
0
Fork 0
Commit Graph

36352 Commits (fc387d8daf3960c5e1bc18fa353768056f4fd394)

Author SHA1 Message Date
Linus Torvalds 3950e97543 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "During the development of v5.7 I ran into bugs and quality of
  implementation issues related to exec that could not be easily fixed
  because of the way exec is implemented. So I have been diggin into
  exec and cleaning up what I can.

  This cycle I have been looking at different ideas and different
  implementations to see what is possible to improve exec, and cleaning
  the way exec interfaces with in kernel users. Only cleaning up the
  interfaces of exec with rest of the kernel has managed to stabalize
  and make it through review in time for v5.9-rc1 resulting in 2 sets of
  changes this cycle.

   - Implement kernel_execve

   - Make the user mode driver code a better citizen

  With kernel_execve the code size got a little larger as the copying of
  parameters from userspace and copying of parameters from userspace is
  now separate. The good news is kernel threads no longer need to play
  games with set_fs to use exec. Which when combined with the rest of
  Christophs set_fs changes should security bugs with set_fs much more
  difficult"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits)
  exec: Implement kernel_execve
  exec: Factor bprm_stack_limits out of prepare_arg_pages
  exec: Factor bprm_execve out of do_execve_common
  exec: Move bprm_mm_init into alloc_bprm
  exec: Move initialization of bprm->filename into alloc_bprm
  exec: Factor out alloc_bprm
  exec: Remove unnecessary spaces from binfmts.h
  umd: Stop using split_argv
  umd: Remove exit_umh
  bpfilter: Take advantage of the facilities of struct pid
  exit: Factor thread_group_exited out of pidfd_poll
  umd: Track user space drivers with struct pid
  bpfilter: Move bpfilter_umh back into init data
  exec: Remove do_execve_file
  umh: Stop calling do_execve_file
  umd: Transform fork_usermode_blob into fork_usermode_driver
  umd: Rename umd_info.cmdline umd_info.driver_name
  umd: For clarity rename umh_info umd_info
  umh: Separate the user mode driver and the user mode helper support
  umh: Remove call_usermodehelper_setup_file.
  ...
2020-08-04 14:27:25 -07:00
Linus Torvalds 99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
- Clean up non-trivial uses of uninitialized_var()
 - Update documentation and checkpatch for uninitialized_var() removal
 - Treewide removal of uninitialized_var()
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oYLQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsfjEACvf0D3WL3H7sLHtZ2HeMwOgAzq
 il08t6vUscINQwiIIK3Be43ok3uQ1Q+bj8sr2gSYTwunV2IYHFferzgzhyMMno3o
 XBIGd1E+v1E4DGBOiRXJvacBivKrfvrdZ7AWiGlVBKfg2E0fL1aQbe9AYJ6eJSbp
 UGqkBkE207dugS5SQcwrlk1tWKUL089lhDAPd7iy/5RK76OsLRCJFzIerLHF2ZK2
 BwvA+NWXVQI6pNZ0aRtEtbbxwEU4X+2J/uaXH5kJDszMwRrgBT2qoedVu5LXFPi8
 +B84IzM2lii1HAFbrFlRyL/EMueVFzieN40EOB6O8wt60Y4iCy5wOUzAdZwFuSTI
 h0xT3JI8BWtpB3W+ryas9cl9GoOHHtPA8dShuV+Y+Q2bWe1Fs6kTl2Z4m4zKq56z
 63wQCdveFOkqiCLZb8s6FhnS11wKtAX4czvXRXaUPgdVQS1Ibyba851CRHIEY+9I
 AbtogoPN8FXzLsJn7pIxHR4ADz+eZ0dQ18f2hhQpP6/co65bYizNP5H3h+t9hGHG
 k3r2k8T+jpFPaddpZMvRvIVD8O2HvJZQTyY6Vvneuv6pnQWtr2DqPFn2YooRnzoa
 dbBMtpon+vYz6OWokC5QNWLqHWqvY9TmMfcVFUXE4AFse8vh4wJ8jJCNOFVp8On+
 drhmmImUr1YylrtVOw==
 =xHmk
 -----END PGP SIGNATURE-----

Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Linus Torvalds 0408497800 Power management updates for 5.9-rc1
- Make the Energy Model cover non-CPU devices (Lukasz Luba).
 
  - Add Ice Lake server idle states table to the intel_idle driver
    and eliminate a redundant static variable from it (Chen Yu,
    Rafael Wysocki).
 
  - Eliminate all W=1 build warnings from cpufreq (Lee Jones).
 
  - Add support for Sapphire Rapids and for Power Limit 4 to the
    Intel RAPL power capping driver (Sumeet Pawnikar, Zhang Rui).
 
  - Fix function name in kerneldoc comments in the idle_inject power
    capping driver (Yangtao Li).
 
  - Fix locking issues with cpufreq governors and drop a redundant
    "weak" function definition from cpufreq (Viresh Kumar).
 
  - Rearrange cpufreq to register non-modular governors at the
    core_initcall level and allow the default cpufreq governor to
    be specified in the kernel command line (Quentin Perret).
 
  - Extend, fix and clean up the intel_pstate driver (Srinivas
    Pandruvada, Rafael Wysocki):
 
    * Add a new sysfs attribute for disabling/enabling CPU
      energy-efficiency optimizations in the processor.
 
    * Make the driver avoid enabling HWP if EPP is not supported.
 
    * Allow the driver to handle numeric EPP values in the sysfs
      interface and fix the setting of EPP via sysfs in the active
      mode.
 
    * Eliminate a static checker warning and clean up a kerneldoc
      comment.
 
  - Clean up some variable declarations in the powernv cpufreq
    driver (Wei Yongjun).
 
  - Fix up the ->enter_s2idle callback definition to cover the case
    when it points to the same function as ->idle correctly (Neal
    Liu).
 
  - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson).
 
  - Make the PM core emit "changed" uevent when adding/removing the
    "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi).
 
  - Add a helper macro for declaring PM callbacks and use it in the
    MMC jz4740 driver (Paul Cercueil).
 
  - Fix white space in some places in the hibernate code and make the
    system-wide PM code use "const char *" where appropriate (Xiang
    Chen, Alexey Dobriyan).
 
  - Add one more "unsafe" helper macro to the freezer to cover the NFS
    use case (He Zhe).
 
  - Change the language in the generic PM domains framework to use
    parent/child terminology and clean up a typo and some comment
    fromatting in that code (Kees Cook, Geert Uytterhoeven).
 
  - Update the operating performance points OPP framework (Lukasz
    Luba, Andrew-sh.Cheng, Valdis Kletnieks):
 
    * Refactor dev_pm_opp_of_register_em() and update related drivers.
 
    * Add a missing function export.
 
    * Allow disabled OPPs in dev_pm_opp_get_freq().
 
  - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric
    Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier):
 
    * Add support for delayed timers to the devfreq core and make the
      Samsung exynos5422-dmc driver use it.
 
    * Unify sysfs interface to use "df-" as a prefix in instance names
      consistently.
 
    * Fix devfreq_summary debugfs node indentation.
 
    * Add the rockchip,pmu phandle to the rk3399_dmc driver DT
      bindings.
 
    * List Dmitry Osipenko as the Tegra devfreq driver maintainer.
 
    * Fix typos in the core devfreq code.
 
  - Update the pm-graph utility to version 5.7 including a number of
    fixes related to suspend-to-idle (Todd Brandt).
 
  - Fix coccicheck errors and warnings in the cpupower utility (Shuah
    Khan).
 
  - Replace HTTP links with HTTPs ones in multiple places (Alexander
    A. Klimov).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl8oO24SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx7ZQP/0lQ0yABnASnwomdOH6+K/m7rvc+e9FE
 zx5pTDQswhU5tM7SQAIKqe0uSI+okF2UrBrT5onA16F+JUbnrbexJLazBPfVTTGF
 AKpKEQ7Wh69Wz+Y6cQZjm1dTuRL+dlBJuBrzR2tLSnONPMMHuFcO3xd7lgE9UAxC
 oGEf393taA6OqcUNRQIa2gqbq+k1qhKjeDucGkbOaoJ6CL0ZyWI+Tfw1WWaBBGv0
 /2wBd6V513OH8WtQCW6H3YpHmhYW6OwL8w19KyGcjPRGJaeaIP4W/Ng7mkvgL5ZB
 vZqg3XiufFV9uTe8W1NQaVv/NjlN256OteuK809aosTVjD0dhFkhBYg5TLu6HbQq
 C/NciZ+78oLedWLT73EUfw3NyS+V0jk6X2EIlBUwNi0Qw1B1pCifGOCKzWFFe5cr
 ci4xr4FG7dBkxScOxwFAU2s5TdPHLOkGkQtg4jZr0OYDrzkyLEdsnZEUjLPORo+0
 6EBXGfTOSy2CBHcYswRtzJr/1pUTzj7oejhTAMCCuYW2r3VyQtnYcVjlehtp20if
 6BfmGisk8nmtxlSm+/Y2FqKa4bNnSTMmr0UJQ+Rjp0tHs47QeucI0ORfZ5nPaBac
 +ptvIjWmn3xejT/+oAehpH9066Iuy66vzHdnj7x5+WAsmYS8n8OFtlBFkYELmLJB
 3xI5hIl7WtGo
 =8cUO
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The most significant change here is the extension of the Energy Model
  to cover non-CPU devices (as well as CPUs) from Lukasz Luba.

  There is also some new hardware support (Ice Lake server idle states
  table for intel_idle, Sapphire Rapids and Power Limit 4 support in the
  RAPL driver), some new functionality in the existing drivers (eg. a
  new switch to disable/enable CPU energy-efficiency optimizations in
  intel_pstate, delayed timers in devfreq), some assorted fixes (cpufreq
  core, intel_pstate, intel_idle) and cleanups (eg. cpuidle-psci,
  devfreq), including the elimination of W=1 build warnings from cpufreq
  done by Lee Jones.

  Specifics:

   - Make the Energy Model cover non-CPU devices (Lukasz Luba).

   - Add Ice Lake server idle states table to the intel_idle driver and
     eliminate a redundant static variable from it (Chen Yu, Rafael
     Wysocki).

   - Eliminate all W=1 build warnings from cpufreq (Lee Jones).

   - Add support for Sapphire Rapids and for Power Limit 4 to the Intel
     RAPL power capping driver (Sumeet Pawnikar, Zhang Rui).

   - Fix function name in kerneldoc comments in the idle_inject power
     capping driver (Yangtao Li).

   - Fix locking issues with cpufreq governors and drop a redundant
     "weak" function definition from cpufreq (Viresh Kumar).

   - Rearrange cpufreq to register non-modular governors at the
     core_initcall level and allow the default cpufreq governor to be
     specified in the kernel command line (Quentin Perret).

   - Extend, fix and clean up the intel_pstate driver (Srinivas
     Pandruvada, Rafael Wysocki):

       * Add a new sysfs attribute for disabling/enabling CPU
         energy-efficiency optimizations in the processor.

       * Make the driver avoid enabling HWP if EPP is not supported.

       * Allow the driver to handle numeric EPP values in the sysfs
         interface and fix the setting of EPP via sysfs in the active
         mode.

       * Eliminate a static checker warning and clean up a kerneldoc
         comment.

   - Clean up some variable declarations in the powernv cpufreq driver
     (Wei Yongjun).

   - Fix up the ->enter_s2idle callback definition to cover the case
     when it points to the same function as ->idle correctly (Neal Liu).

   - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson).

   - Make the PM core emit "changed" uevent when adding/removing the
     "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi).

   - Add a helper macro for declaring PM callbacks and use it in the MMC
     jz4740 driver (Paul Cercueil).

   - Fix white space in some places in the hibernate code and make the
     system-wide PM code use "const char *" where appropriate (Xiang
     Chen, Alexey Dobriyan).

   - Add one more "unsafe" helper macro to the freezer to cover the NFS
     use case (He Zhe).

   - Change the language in the generic PM domains framework to use
     parent/child terminology and clean up a typo and some comment
     fromatting in that code (Kees Cook, Geert Uytterhoeven).

   - Update the operating performance points OPP framework (Lukasz Luba,
     Andrew-sh.Cheng, Valdis Kletnieks):

       * Refactor dev_pm_opp_of_register_em() and update related drivers.

       * Add a missing function export.

       * Allow disabled OPPs in dev_pm_opp_get_freq().

   - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric
     Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier):

       * Add support for delayed timers to the devfreq core and make the
         Samsung exynos5422-dmc driver use it.

       * Unify sysfs interface to use "df-" as a prefix in instance
         names consistently.

       * Fix devfreq_summary debugfs node indentation.

       * Add the rockchip,pmu phandle to the rk3399_dmc driver DT
         bindings.

       * List Dmitry Osipenko as the Tegra devfreq driver maintainer.

       * Fix typos in the core devfreq code.

   - Update the pm-graph utility to version 5.7 including a number of
     fixes related to suspend-to-idle (Todd Brandt).

   - Fix coccicheck errors and warnings in the cpupower utility (Shuah
     Khan).

   - Replace HTTP links with HTTPs ones in multiple places (Alexander A.
     Klimov)"

* tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits)
  cpuidle: ACPI: fix 'return' with no value build warning
  cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
  cpufreq: intel_pstate: Rearrange the storing of new EPP values
  intel_idle: Customize IceLake server support
  PM / devfreq: Fix the wrong end with semicolon
  PM / devfreq: Fix indentaion of devfreq_summary debugfs node
  PM / devfreq: Clean up the devfreq instance name in sysfs attr
  memory: samsung: exynos5422-dmc: Add module param to control IRQ mode
  memory: samsung: exynos5422-dmc: Adjust polling interval and uptreshold
  memory: samsung: exynos5422-dmc: Use delayed timer as default
  PM / devfreq: Add support delayed timer for polling mode
  dt-bindings: devfreq: rk3399_dmc: Add rockchip,pmu phandle
  PM / devfreq: tegra: Add Dmitry as a maintainer
  PM / devfreq: event: Fix trivial spelling
  PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent
  cpuidle: change enter_s2idle() prototype
  cpuidle: psci: Prevent domain idlestates until consumers are ready
  cpuidle: psci: Convert PM domain to platform driver
  cpuidle: psci: Fix error path via converting to a platform driver
  cpuidle: psci: Fail cpuidle registration if set OSI mode failed
  ...
2020-08-03 20:28:08 -07:00
Linus Torvalds e53bc3ff99 Boris is on vacation and he asked us to send you the pending RAS bits:
- Print the PPIN field on CPUs that fill them out
  - Fix an MCE injection bug
  - Simplify a kzalloc in dev_mcelog_init_device()
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oZ+ERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIVQ//XLP3qky/X/DOcOhdqw7JEDUpElK+sLdb
 nFLcSP0vgdoip0jjz0voEf2tb0nRbZRqjnCIuHj5Dq8fTWbM/qcD0zM35N5lk5bJ
 N+y9cE+FQg2oayU/0O3IezoT0lAW3YzdE2HdzhZ2ZEiDKx3fAj9QoJbKlc4TxshC
 Tg93IpMq5dmnj6Ge3BfNE56O4WCLIQxs2MCIWHWjBEU8qb4Nry7vEL9hijmFQNRd
 85i9pDrOkdnLDKhTagHUR4HbhkAacWeJKe0UoraJFr8/1nCm8Hcii5hI1t5N9KXg
 /+asjkhLX75uYy2aMmwF7qelSLNbOieMNji7rt42vevBsqrAtX0zuSvbEBb8bNdx
 f4WS5GcmUTZVHTuppPzAX3YNmYdrGqmecerAY+1j+uHtWFGs07ZCnKx+kQbiK0m4
 LqaQmi16KsV/Jj9BYZu8wLd/uIAsRqnY5nJlbe7of0/kFPzlJkRkT8Z3DPhSp/ee
 qHc3eR1EFMQvNr8F1e17TPCTcq5YjLM9BXFX4JJppaX30YOb6ZH0z5jYMbouJeKD
 8qtQ1BtawWKqkoi/0PtbH6khRFXH6oUNWZ+2aAKQqWwfJDdP8Q7fFyIvgN487s79
 u7fN4h8qj7w1AeIRJQH5v1G/Es14gdBcotgkM2AflQLF+c7QormnHdCJZad4gTAj
 /X8Ks5DE51w=
 =z0MD
 -----END PGP SIGNATURE-----

Merge tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Ingo Molnar:
 "Boris is on vacation and he asked us to send you the pending RAS bits:

   - Print the PPIN field on CPUs that fill them out

   - Fix an MCE injection bug

   - Simplify a kzalloc in dev_mcelog_init_device()"

* tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Print PPIN in machine check records
  x86/mce/dev-mcelog: Use struct_size() helper in kzalloc()
  x86/mce/inject: Fix a wrong assignment of i_mce.status
2020-08-03 17:42:23 -07:00
Linus Torvalds a92ad11fb2 A single commit which sets the X86_FEATURE_TSC_KNOWN_FREQ flag for Xen guests,
to avoid recalibration.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oYlARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hA+Q//YeTzZc8UnXyLzIB0AnP/jAvgbn2KMmSD
 PmsBBTBD/q24hkMoj/hiyn0lwOHzJ6fnb4T08WYfpDbhbWt8bJVrfu6QWpK4duLP
 dr58iiqyZ78o/G4Nuzr755JU2/Yy18682t9if4038vuWu/osmlRbNExYj/dMTLFW
 5Q63dy9y8vB+/rWI2/MoxXjym+qy/S4Px2X79XMrazOsPBkbxUHwt99SsEHsjZRf
 +94+z04z/sXV5HznbdXHeKJsvvAaz+kelIOJ9kDskt4VTwgcPdYSlX8Wpun6O7uR
 YuJTCpk2Hsx3ERSKNCFIJ1+7uZsA8E2blcK35iiScaTCa7qCvpA9KC1hDpVqtH1B
 jEuJIG2VpBVx5oxYpZV2kiwEQzcsnBuIeiMS2Fd+SCLcAkPBxARgOlHu0yLSTtkl
 92cbRXp+w6N1SnIEMFwgkZAy3AjWTAXROXXyKiMMjLkaTFRLMgzcYWYWFVng3GrK
 4AUIUESrHeGHWvOa1vwTXmUCxSvazPH/WxyuG4pwkeZpH++peiRlKzHwo1TEXXCw
 59i8Sw4+OqdIN/ZmWQ8wB2v1oUGvA4j/Ta/GFmdyoPEII67XVcmtMuQV3tjUxLAb
 FdEgBzrdQp0yM/UYF2pQYJiV4qjdh2/K3wFkJI1c2rD6e5On54t51E0Cf9C1oyPH
 wn32OE5yr9U=
 =S8sw
 -----END PGP SIGNATURE-----

Merge tag 'x86-timers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 timer update from Ingo Molnar:
 "Set the X86_FEATURE_TSC_KNOWN_FREQ flag for Xen guests, to avoid
  recalibration"

* tag 'x86-timers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen/time: Set the X86_FEATURE_TSC_KNOWN_FREQ flag in xen_tsc_khz()
2020-08-03 17:41:06 -07:00
Linus Torvalds 5183a617ec The biggest change is the removal of SGI UV1 support, which allowed the
removal of the legacy EFI old_mmap code as well.
 
 This removes quite a bunch of old code & quirks.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oYBoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1izAg//adabmMctmGxpWETEj1UmZdgJcKu5l+H3
 SiC6hoB46NB5J6E2whhOXuJGqgBGY/qx9Qy3LMFwgXNki85TQEEFTcGfd+RcXTXz
 B+oASL5Z+KOJ+QQqFf8W/dYye4ZXXxQN8gFSl9zaY3iXSY28JLMFrwkckkIrBwLZ
 9DmB7NaxeMQ22IEan641sK/YafGIriHvMpe34s2/p2WpemyncG1y1tZcPxZ/9K9U
 7OZ+4O6gr9n3FloD9yp3z1jFXiU4gSWEgbZvy3bD/AjbXs+oaaaIFDZ6yM/C3JYo
 Qw0PPTBSXkKTmG3UaK4vml4NPROLqyAzv8k5UTJmYQiVlSkyAVW80mzMQ9qDAhfK
 ny2+XE7ahjsFiNLFbboSOqLuTyHogkzcFCBrwW1wCTnPDEX5QCQF9t0agrY4pqqZ
 8lSMLj0BmkAPoi5XHq4WVEJL+0GyYFWmJwdtUAzd2Hj5w6mse0TiL7PtqAUfdFH0
 E5nBnY5RvqtWwfEungGl+zFmSRvH8UzOUydMMuG8koNlGaNZF4TwNVrMukbVqArX
 34Zt1bQrLUY/T4xNM+Zx1BWOFd5D3BUuI7xVhE5zMg20A8PHtPzMByoomoQrgGX2
 MDVUoqnPxeXWUil06+ND6koOtaL+xJy0JsUIFs7h9SGsF+JxsrYKbbLfjjDvtpuK
 sdb+fYzmgBU=
 =NChe
 -----END PGP SIGNATURE-----

Merge tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:
 "The biggest change is the removal of SGI UV1 support, which allowed
  the removal of the legacy EFI old_mmap code as well.

  This removes quite a bunch of old code & quirks"

* tag 'x86-platform-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Remove unused EFI_UV1_MEMMAP code
  x86/platform/uv: Remove uv bios and efi code related to EFI_UV1_MEMMAP
  x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()
  x86/efi: Delete SGI UV1 detection.
  x86/platform/uv: Remove efi=old_map command line option
  x86/platform/uv: Remove vestigial mention of UV1 platform from bios header
  x86/platform/uv: Remove support for UV1 platform from uv
  x86/platform/uv: Remove support for uv1 platform from uv_hub
  x86/platform/uv: Remove support for UV1 platform from uv_bau
  x86/platform/uv: Remove support for UV1 platform from uv_mmrs
  x86/platform/uv: Remove support for UV1 platform from x2apic_uv_x
  x86/platform/uv: Remove support for UV1 platform from uv_tlb
  x86/platform/uv: Remove support for UV1 platform from uv_time
2020-08-03 17:38:43 -07:00
Linus Torvalds e96ec8cf9c The biggest change is to not sync the vmalloc and ioremap ranges for x86-64 anymore.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oW+oRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hEsBAAkFaP2QXd8Nc3nqA+eQoBwSty5NumQWNi
 /VC4nt2R3vqa547XgMgT0RslrftUeCZ6bI0gs7yG6vCrjHbsiFzqliOUPJPXHoUE
 MQMw0tw1x7ODVsAGkzJ5YYrhBvZ2rDvaewAHNRLdkmdLJwXobpCHk4YM48O0oInh
 rlTC9Lgu4jOGOUBazZt4a1xAZGbMK4uk6Er14HGONoqwHV0XC4lDnJm7qNWJLquE
 KEFPWyEC6/T4Vwtmc98eDd+/gJ3sDC2tRtRalW0U5BZQXWjbMLUh/g5aJF8g0NG1
 kmHgbIf5todhFgl/D7ZWJ/i6jucGSZ1u43IsOQwuQ2U/flsUrHDPxbLW55r1axld
 K/RdvWvanWBb1A77Z/2h9eMwm0saEBjQT/UmtT4fNQpgUHJ1AJaVLHcgeFWQsVVX
 AbrP6DAGkEV7QqxQjCl9PK4kOi6BNwvKIfKmFsy5+Ypgx9e39OZmd9FMNt7mJslC
 2htTdXsUYUgwCrQH9brUjUs943mrvWXUYdencvVw1ByWwkSTRm5owXL0w3XtKuwN
 fEtcq0Y6ALeTwjyN0sja3johR93qaHWmyI8t5NLzT+g1rQgrd9Hae6dChyx1nzlz
 I5xIQF4ropZhhP6vcifiyDQcltOZsspVOMQzLTepuqZEpqDEoUtbjZhI4JcS51nc
 qe5rEOGGa00=
 =D9UW
 -----END PGP SIGNATURE-----

Merge tag 'x86-mm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mmm update from Ingo Molnar:
 "The biggest change is to not sync the vmalloc and ioremap ranges for
  x86-64 anymore"

* tag 'x86-mm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/64: Make sync_global_pgds() static
  x86/mm/64: Do not sync vmalloc/ioremap mappings
  x86/mm: Pre-allocate P4D/PUD pages for vmalloc area
2020-08-03 17:25:42 -07:00
Linus Torvalds c813e8c9df Filter MSR writes from user-space by default, and print a syslog entry if
they happen outside the allowed set of MSRs, which is a single one for now,
 MSR_IA32_ENERGY_PERF_BIAS.
 
 The plan is to eventually disable MSR writes by default (they can still be
 enabled via allow_writes=on).
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oWVwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j61Q/+KDteLFjfVaJIEe0yfZ8YWjAPb1HRDg1u
 ia8wyZael5hHpLOQSOhkaTwbvF8dMSLK3oky8ZO2oA5gv99jyB+OZxOGS6mvMcGM
 aI9chECEghlvUKy1i+NNHSK4W9p4SSu0eK57EkX1VRalZmntTVEX7nlw7oI6nacW
 8iP7MIqYWdF1QI6n5gDh0i/62/ictOjG2yemooPNh1I5E5Cw+MhOqTZ7gL+nu1M6
 e2+oqo9i3+1yBAN3RH4M8tCwrtTwoVnmQlgPAqT5u2eOATu1b8cTOs2/aC6dfcGg
 3u1hcbXuQKRE4DuT7QhDJN1cQZt/GMdymTkR7xIlVqzmNPkM+DCiWPujVMU7r2ia
 z1Chwxd1gmDrX3g6XPrcXneGrasLkHIRAwYS8EN2wZyNIpb71Y9/uWZVJPza6OO2
 Gdxz/y1vCYUu43FSy+gCCirWW+h/HFc4BEzikvB+splQPSiJWJw/Nol+xkBnPds7
 LjQTpKSXd1nDtDyTsz0nZAp7EEft78Ez2mj4zYNgqZDBXLEiRPES47VEO3Ls9SS4
 t7AtIY1GtdUlFIPrA0mrbcggf/gYxJCFofxR2J7gzyH3DFXIlfw9HbS4+r4xv9RP
 v2wJUwN35y5SJRNymztanArTNPkGXOlcJ3s7nYQk13UQ0hWTtW3e86pMPsi5C3ui
 6DjHG8GXWh4=
 =X+M1
 -----END PGP SIGNATURE-----

Merge tag 'x86-misc-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 MSR filtering from Ingo Molnar:
 "Filter MSR writes from user-space by default, and print a syslog entry
  if they happen outside the allowed set of MSRs, which is a single one
  for now, MSR_IA32_ENERGY_PERF_BIAS.

  The plan is to eventually disable MSR writes by default (they can
  still be enabled via allow_writes=on)"

* tag 'x86-misc-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/msr: Filter MSR writes
2020-08-03 17:23:49 -07:00
Linus Torvalds 69094c2032 A single commit that removes the microcode loader's FW_LOADER coupling.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oVl8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gb5w//Y6V8o2qam5GMG3CAFMWeuaPLehTLX+e3
 GxXB+rQpVbe8WrroroP14KOoYYI5Nq6T/giO5HwL0h3/CbYZr8xcv12HSh7AQ8hX
 Zluybu9eN3P4ot0HEyIOphQCSppvjF9E0Zljs77miDbAC7AljdF/BQ3aqXOSxmJR
 haMvO0VDtE36JaxKIBKcrt/dmk9iSfdlY2FFjZ1Ia52bDplFzDHEmBi/3MfjahXa
 AkygeTwoRvUsvpBiY6jzRDLJb6JMuP8bqoOxEhzbNWKNye/EYUJdIehWl772+Rdz
 U+iXshAT302XPJ88Fo2VOQU3JhLbWHf9VBpMA+DrnWh0bFKvB6nZKlPu9TkMo25J
 WbKuyKFKoxUHXLH1mr8MkKHrUXDcnocBIn//n9WXQLA2/gTsxmN/Wtt0y0f5iQK0
 Y3KH/GfkdJimbIg5Y/NvepbRlghQomm919IRtECOw3QkPMdgDarpEh3Un8VSSoI3
 6pj9NPNmvnaP8wEg4BZR55YdMC+7EsYT+oRWdBiWeGbu6+TSCWjamrTrA2e0OtLM
 Jw25YS67iLVJW7Vsp3FnBovnJ3FxY/ss0OcF70VPaj6/P8YZhaESqoZfUs8U2NRP
 IxP8dxlxIM6AT7XdKmLDQ7S2YtEdCvoJtoK2yU4Nx1hNusy3d6WeOxLRbnofG4yo
 e3vRLccLt6M=
 =/LiU
 -----END PGP SIGNATURE-----

Merge tag 'x86-microcode-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode update from Ingo Molnar:
 "Remove the microcode loader's FW_LOADER coupling"

* tag 'x86-microcode-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Do not select FW_LOADER
2020-08-03 17:22:13 -07:00
Linus Torvalds 335ad94c21 Misc changes:
- Prepare for Intel's new SERIALIZE instruction
  - Enable split-lock debugging on more CPUs
  - Add more Intel CPU models
  - Optimize stack canary initialization a bit
  - Simplify the Spectre logic a bit
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oTsQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gueQ//Vh9sTi8+q5ZCxXnJQOi59SZsFy1quC2Q
 6bFoSQ46npMBoYyC2eDQ4exBncWLqorT8Vq/evlW3XPldUzHKOk7b4Omonwyrrj5
 dg5fqcRjpjU8ni6egmy4ElMjab53gDuv0yNazjONeBGeWuBGu4vI2bP2eY3Addfm
 2eo2d5ZIMRCdShrUNwToJWWt6q4DzL/lcrVZAlX0LwlWVLqUCdIARALRM7V1XDsC
 udxS8KnvhTaJ7l63BSJREe3AGksLQd9P4UkJS4IE4t0zINBIrME043BYBMTh2Vvk
 y3jykKegIbmhPquGXG8grJbPDUF2/3FxmGKTIhpoo++agb2fxt921y5kqMJwniNS
 H/Gk032iGzjjwWnOoWE56UeuDTOlweSIrm4EG22HyEDK7kOMJusjYAV5fB4Sv7vj
 TBy5q0PCIutjXDTL1hIWf0WDiQt6eGNQS/yt3FlapLBGVRQwMU/pKYVVIOIaFtNs
 szx1ZeiT358Ww8a2fQlb8pqv50Upmr2wqFkAsMbm+NN3N92cqK6gJlo1p7fnxIuG
 +YVASobjsqbn0S62v/9SB/KRJz07adlZ6Tl/O/ILRvWyqik7COCCHDVJ62Zzaz5z
 LqR2daVM5H+Lp6jGZuIoq/JiUkxUe2K990eWHb3PUpOC4Rh73PvtMc7WFhbAjbye
 XV3eOEDi65c=
 =sL2Q
 -----END PGP SIGNATURE-----

Merge tag 'x86-cpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Ingo Molar:

 - prepare for Intel's new SERIALIZE instruction

 - enable split-lock debugging on more CPUs

 - add more Intel CPU models

 - optimize stack canary initialization a bit

 - simplify the Spectre logic a bit

* tag 'x86-cpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Refactor sync_core() for readability
  x86/cpu: Relocate sync_core() to sync_core.h
  x86/cpufeatures: Add enumeration for SERIALIZE instruction
  x86/split_lock: Enable the split lock feature on Sapphire Rapids and Alder Lake CPUs
  x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU family
  x86/stackprotector: Pre-initialize canary for secondary CPUs
  x86/speculation: Merge one test in spectre_v2_user_select_mitigation()
2020-08-03 17:08:02 -07:00
Linus Torvalds 4ee4810315 Improve x86 debuggability: print registers with the same log level as the backtrace.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oSVwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iVgA//dJgCrIXMtv30UCHH6i4pWqUB9uPAoqk4
 tK3A6L2+D+1nSbREe3DW/8Lz5val3KjfZmDkwZzdtT6W9B3lv7fNE8dU/2yA18Ov
 DPAAf+s+A/rQlNOi+PNQ9tm4zRitHQzdth2p9CeR+Ty5WcaPZ+sTjlSZH4me6a2h
 M4bK8pemseifO7DLlB+ZehrsFF1dcxqgKeZGXY4bpL7w3iunfc8+SNhVa3QX2Yay
 i1XqCHXaAL7LEpmxyTtH0NYrMoEO2z6qpJ4r93I41ku4lHBXjyY+6siJYHX4cmRr
 RoUbPNlI6BWODS+O0RRYnwlWwkqzC6kahHWjsq8kq6n3QXDXHh8oqqw+qV5BCtSL
 q6CBCAjjYOCH+fMKp22U23kKG+YpTtbtFEewGhMvto5Czis4UdLcuGjuLSmhSrYC
 Lc78DE6V2xNb5bb769oVHSk99ztb/6JhJk3jvIpCAY4Trl52m4RZBTNAGjlSekyQ
 A1EA7siA7W6Ccs0S0XXXVJ8Uzx/UzwzLFfPGZJPCWV/G/+CH4Gq+7dkKh6TKZ+pu
 Cc8yFxs2eBRr6a8GZtmlCN4ydLBkrOzwgWRaLO1CcZC/mgUJhWKoQYuhuuN/Fcnv
 q208MaVOXzV4mAHlqjqTccmFVeB5rL4/+s+7vADj0CBi8xFLkiIySmBrVyy+01Q6
 UfhzuNsJBUk=
 =zyiS
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 debug fixlets from Ingo Molnar:
 "Improve x86 debuggability: print registers with the same log level as
  the backtrace"

* tag 'x86-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Show registers dump with trace's log level
  x86/dumpstack: Add log_lvl to __show_regs()
  x86/dumpstack: Add log_lvl to show_iret_regs()
2020-08-03 17:00:00 -07:00
Linus Torvalds 37e88224c0 Misc cleanups all around the place.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oRTgRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1huHQ//T2hZk5zlpOtojxvdAzsPgtV4tHawseK8
 +ZZEbrH5qo5/ZMF18qyEJCm9p1yg8uIu71InULRCSgjU3v82GVCcuLXuE36U904G
 gHUqkYPnqxCqx+Li125aye9tKWahXe1DxX+uWbV0Ju7fiCO0rwYIzpWn1bnR6ilp
 fmLGSbgPlTVJwZ9mBvyi3VUlH5tDYidFN74TREUOwx2g5uhg+8uEo44Eb/bx8ESF
 dGt1Z/fnfDHkUZtmhzJk5Uz8nbw7rPHU/EZ4iZAxEzxTutY5PhsvbIfLO4t4HhGn
 utZCk/pIdiLLQ1GaTvFxqi3iolDqpOuXpnDlfEAJD8UlMCnwyh1Certq5LaRbtHS
 8SW3/CeJgzqzrrsYhkxVu2PMFWriSMxgKTLiN0KnzJN0Hu7A5lHbBY/6G7zpsF/A
 2KJ4e8lZiPCcNF7LteSRroUe4hNOYxZ2FlYTXm3AgycSL189UMfWlHFb5c+b4m1a
 cNJpz+jAom8foXN4KhRkl5PFKXVXDGTVln3NRJCh1Mqd1Ef4hsTo9H6FgHX/EfHg
 slJDwwPac80v0dzlMTSsMkyseaKRAqIObWOiknPt1wv/qja7ibVZ5mUbZ+/mfJX/
 YWybcPi1omgUSNt7TNx6jtma67rUjmJW0x9g7UJ/ttEkf6yG2lemrdusydBYuIni
 0Z2+hWzI9MM=
 =X7o0
 -----END PGP SIGNATURE-----

Merge tag 'x86-cleanups-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups all around the place"

* tag 'x86-cleanups-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioperm: Initialize pointer bitmap with NULL rather than 0
  x86: uv: uv_hub.h: Delete duplicated word
  x86: cmpxchg_32.h: Delete duplicated word
  x86: bootparam.h: Delete duplicated word
  x86/mm: Remove the unused mk_kernel_pgd() #define
  x86/tsc: Remove unused "US_SCALE" and "NS_SCALE" leftover macros
  x86/ioapic: Remove unused "IOAPIC_AUTO" define
  x86/mm: Drop unused MAX_PHYSADDR_BITS
  x86/msr: Move the F15h MSRs where they belong
  x86/idt: Make idt_descr static
  initrd: Remove erroneous comment
  x86/mm/32: Fix -Wmissing prototypes warnings for init.c
  cpu/speculation: Add prototype for cpu_show_srbds()
  x86/mm: Fix -Wmissing-prototypes warnings for arch/x86/mm/init.c
  x86/asm: Unify __ASSEMBLY__ blocks
  x86/cpufeatures: Mark two free bits in word 3
  x86/msr: Lift AMD family 0x15 power-specific MSRs
2020-08-03 16:53:28 -07:00
Linus Torvalds 1ff9b20b47 Misc changes: refresh defconfigs and simplify the boot image link stage.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oQioRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1i8Fg/+IC2Zyg5Jvh8zkYI+6ujQqk0RybUixR14
 +qQ3DRjiKau8C/4/KN5QPpwn6aCtWJTeRszbRJhh2cFEDoPTKQaxPZgc7Y4B8Mok
 YULRi6DshLJZShnL6mN2LziyBTQeBRMBgWWcBcotVfm+XBMjCTbTPCq0vuQU5MUS
 xWNOs1727m8r+Fvb/kjEtBXUrQCL4p2YeW85+hjCzsd96U4io52qkOWdVK2+GbDw
 WFGu6M4SYmtvZVcnnYmMBYuHIKF+cXYaeztVIxjfVQzYnZjljlKhsbuZKaRaz3od
 7lqj76kCXnm3XRdndjPSaSRAtBLu9h7L2ayz5Q03dOTwHYlqvgnhF/4Qw7XngW0L
 F9rcboWp6H7v9NPyBczFd9jnwsDy9rMDEvVGqLAlH6sgIVRNVKUfp4tK8r605P97
 88eWptGL8k3xFSt4deLCTFhXH6zn6PigwgFEKxH7cXwf/5woF/g32yo6zBIH8F3N
 IcA7vARvoi70I5do+NQdrjEG+qwhmdJhWm3sqzuq/MkPwvjIIoUnEgnnUz3DP5Mh
 MbxCOH0Z+9df9SaDN+oh8NNSEMAdhkZ6IAR0ysNPcKsVcIYk0vmJ8z3RiYhhd4vT
 D6p4brg0m8HgeAYXt3A4vjhKs/fGdMu9HrdCeovLlbaUOpJQ3NO5sSzzYxbGtEQG
 2EmgvWj7MpM=
 =t/9f
 -----END PGP SIGNATURE-----

Merge tag 'x86-build-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 build updates from Ingo Molnar:
 "Misc changes: refresh defconfigs and simplify the boot image link
  stage"

* tag 'x86-build-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/defconfigs: Refresh defconfig files
  x86/build: Move max-page-size option to LDFLAGS_vmlinux
  x86/defconfigs: Remove CONFIG_CRYPTO_AES_586 from i386_defconfig
2020-08-03 16:51:34 -07:00
Linus Torvalds c0dfadfed8 The main change in this cycle was to add support for ZSTD-compressed
kernel and initrd images.
 
 ZSTD has a very fast decompressor, yet it compresses better than gzip.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oNX0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jdcg/9GaPGjmNgMqi3tbfzU3z11OrbraRBgMj5
 jHIZ89DuzwsqU+jbwGHGiF45ge85iPK6i2ovR3ePzL0LAlLYT3gqzPcl3kkog4E9
 0E0JAddx974uW4toc8cGFEHNf4vXtvvi45FL2yvDoap9xLEcpJsQRdu9upPB4U3s
 +qotO6wJitM74g4l2WdbStzCAcL4ZXFA/ix19nUyLh4QlFBDqUHwufIhW1G0ciL4
 txMXJ23L7e+b6FUvGyK3vFhba1isPdz5xQdQTy2DCK20rQhGu1IBsqzymEibbgIp
 /j4yHfUKSpxdblFcpZfknI1VM1mbt/WN5dKDKm9UnYBhA/R/2PN0klfrAQAT4SOS
 sP3bxXqTRXBjmop0NjOLCdjGCySYnPLFPlB6REIrMcvs6LYUSTqMZEusj7McwD7h
 IqS4zGEMa5A+c6Q4160Qz+zrXIyh/n/bTR/6uOKUktkUQaJ+079P64NK9RtCYZTk
 dkIHJChjmWZGxxXHEbo+4e7bM8gAMHDmX2pdWE5u72oYJRqBv7PVyl+SHBk+onxM
 crtKvqOp8Q8coirlfjx5UynZeZmH1VuIFjpvnwlAtqxZGvuTWZ0ojq3E3Y/XwHQj
 bVejr9AQ1gS9ZBTKwwd5cf7mnOuiXrHrBP3E7buoRw8bWtL+yqHyybqccZnSOUVN
 lGFshs+7J5o=
 =bARW
 -----END PGP SIGNATURE-----

Merge tag 'x86-boot-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Ingo Molnar:
 "The main change in this cycle was to add support for ZSTD-compressed
  kernel and initrd images.

  ZSTD has a very fast decompressor, yet it compresses better than gzip"

* tag 'x86-boot-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation: dontdiff: Add zstd compressed files
  .gitignore: Add ZSTD-compressed files
  x86: Add support for ZSTD compressed kernel
  x86: Bump ZO_z_extra_bytes margin for zstd
  usr: Add support for zstd compressed initramfs
  init: Add support for zstd compressed kernel
  lib: Add zstd support to decompress
  lib: Prepare zstd for preboot environment, improve performance
2020-08-03 16:03:23 -07:00
Linus Torvalds ba77c568f3 A couple of changes, concentrated into the percpu code, to enable
Clang support on i386 kernels too.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oM2wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gS7w//edNcbM5/ZZhQ9trlDoZDOj5jyjqhXXqr
 gY4G0/BSHmV/hjn2CTjQ1GJNDEOxg3AcLklXrXxJ9TxUnoRyju8yoOCyfjTAUHeL
 Nq/wrEUZonMY0axcgmEET5VTfrQ3aXEuNFQ3WemlGDo0E+Cg906CcUXRUS2KvzXd
 dQszdG7GnfOgrrsbTWGS+r8jPPyBdm3xM3lDRQhHyaXrHNDCXHyQXbJqZgS1srWW
 PR0qiVdnf/KqT5FdzF8OWHCXJAx3NjNZ4qNXW61HM7G5Z/S/2TTq3mTYqbFufGc9
 Mmx1rBnG4oiqBFClVKy2VgoA9xUHUs78g2abwJPICW1eDTP1ED1EoxpTPp6JKojs
 kQkaxVPt64eP3bEcnpeXTjGtspe6cxkAIZkxwLbpzqs20lMFMctCi0j/Ev12wCfj
 I+SklQc2lZPHXqGIH8eFgVDLoVJ4lpUsz1YSaL/9mEFCF2/UMmDiZzb5fn6SWQJf
 fudLbhbmecS6/FVpZm7SdWEQwc3YMk4kYyEJs/xN//HUNde00hxaJJUUKbxVfu9C
 ihl+VJ2+dbdTNv7HJVwcK6TZ0+XXXEefXhBo8xi4NX5+Ai1Bf+r2ZC60EUYo59fN
 XK6Q1wS936dmyJbb1gtjcjqzJ0ErXTBqGuwaHjnNmUvVBEjQ34iopXtaQmETnTXa
 QALxFOM85fQ=
 =lmlO
 -----END PGP SIGNATURE-----

Merge tag 'x86-asm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 asm updates from Ingo Molnar:
 "A couple of changes, concentrated into the percpu code, to enable
  Clang support on i386 kernels too"

[ And cleans up the macros to generate percpu ops a lot too  - Linus ]

* tag 'x86-asm-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/uaccess: Make __get_user_size() Clang compliant on 32-bit
  x86/percpu: Remove unused PER_CPU() macro
  x86/percpu: Clean up percpu_stable_op()
  x86/percpu: Clean up percpu_cmpxchg_op()
  x86/percpu: Clean up percpu_xchg_op()
  x86/percpu: Clean up percpu_add_return_op()
  x86/percpu: Remove "e" constraint from XADD
  x86/percpu: Clean up percpu_add_op()
  x86/percpu: Clean up percpu_from_op()
  x86/percpu: Clean up percpu_to_op()
  x86/percpu: Introduce size abstraction macros
2020-08-03 15:58:06 -07:00
Linus Torvalds 97c6f57dc9 A single commit that improves the alternatives patching syslog debug output.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oJc8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jlkRAAkF72DJULcyWrBvoNNeawf+I1DBlRnz0Y
 brZDicQ4SOe5zlIywhUNJHRMY1JCu9qpBaPFVh9yerdaK3IgV6eYE8u+NVrf4Jth
 4azle40SMI2W5eJe73JIEdo3Yn9/2fOZnuwpIxLdeNBCMvLNEuSDwiY+J7kVRfRZ
 OtrkD8mbrZGqOaZxk/AKU4KO0Rh0SfMsaZ6nxdpGOxkh1KwEnClb4ehOo1SXL2xl
 JyNWINOJkjoJhrUTJXmQIJ2lJTtHiI8DttnkPYgygnarBvC3uwoEo7e2O7X696Wm
 iZy75iGjW/yLWHsobJJKwhs16ijFp3P/S+l7BS9w5327ZIemrr7eSbVSu1lOWxGT
 c7tJsPPhfo6658+Zaq9u9LEfCotn4oHREI8HmDTTQiF+MwF2T/R1g0cqgAV/Woo9
 UF+uSbyFsAfoz9N/B+aMfS5rpuSE8ugfZ654GV2nOLhSlDsmUkfOpbzL6PBJSF9P
 DaB4FW7JMHFAU0mVaCAMtAahTp5PPLIdfV/m0AmTIIrMgtmH53Ta/Tt0xFoQy6Ev
 u8GmS0SzM4xgZIe0rKaSgXAooI9eyY5zqF0s8Mwj64+55DLwQ6aEkBxyDiVJzO+b
 nY8t52do6OLCxnG9QEJEsXJXipyFzWOvcj7PTZuZRUACpLe8Bxt42ViTr1Mu0hZs
 ojVETEqeNcU=
 =cpIN
 -----END PGP SIGNATURE-----

Merge tag 'x86-alternatives-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/alternatives update from Ingo Molnar:
 "A single commit that improves the alternatives patching syslog debug
  output"

* tag 'x86-alternatives-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Add pr_fmt() to debug macros
2020-08-03 15:56:37 -07:00
Linus Torvalds e4cbce4d13 The main changes in this cycle were:
- Improve uclamp performance by using a static key for the fast path
 
  - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
    better power efficiency of RT tasks on battery powered devices.
    (The default is to maximize performance & reduce RT latencies.)
 
  - Improve utime and stime tracking accuracy, which had a fixed boundary
    of error, which created larger and larger relative errors as the values
    become larger. This is now replaced with more precise arithmetics,
    using the new mul_u64_u64_div_u64() helper in math64.h.
 
  - Improve the deadline scheduler, such as making it capacity aware
 
  - Improve frequency-invariant scheduling
 
  - Misc cleanups in energy/power aware scheduling
 
  - Add sched_update_nr_running tracepoint to track changes to nr_running
 
  - Documentation additions and updates
 
  - Misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oJDURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ixLg//bqWzFlfWirvngTgDxDnplwUTyKXmMCcq
 R1IYhlyK2O5FxvhbRmdmW11W3yzyTPvgCs6Q/70negGaPNe2w1OxfxiK9NMKz5eu
 M1LoXas7pL5g7Pr/ZxxHk/8VqJLV4t9MkodiiInmV6lTaznT3sU6a/kpYQjJyFnG
 Tuu9jd6JhdRKmePDJnNmUBoGQ7JiOQDcX4HtkcQ3OA+An3624tmJzbW1yts+uj7J
 ZWo2EY60RfbA9MxQXGPOaR/nAjngWs4Q6tddAh10mftsPq1gR2iFUKju1d31MQt/
 RHLdiqJf+AyUC4popKG7a+7ilCKMBwPociSreTJNPyEUQ1X4AM3vUVk4yjUoiDph
 k2WdsCF8/JRdhXg0NnrpPUqOaAbQj53EeXnitEb92E7WyTZgLOvAtpV//xZo6utp
 2QHerfrQ9SoGQjz/ho78za5vQtV1x25yDhd+X4XV4QEhIy85G9/2JCpC/Kc/TXLf
 OO7A4X69XztKTEJhP60g8ldCPUe4N2vbh1vKY6oAD8AFQVVNZ6n7375/Qa//b0/k
 ++hcYkPc2EK97/aBFdvzDgqb7aUo7Mtn2ibke16sQU4szulaoRuAHQG4jdGKMwbD
 dk2VBoxyxeYFXWHsNneSe87+ha3sd0dSN0ul1EB/SlFrVELMvy634YXnMYGW8ima
 PzyPB0ezpuA=
 =PbO7
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Improve uclamp performance by using a static key for the fast path

 - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
   better power efficiency of RT tasks on battery powered devices.
   (The default is to maximize performance & reduce RT latencies.)

 - Improve utime and stime tracking accuracy, which had a fixed boundary
   of error, which created larger and larger relative errors as the
   values become larger. This is now replaced with more precise
   arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h.

 - Improve the deadline scheduler, such as making it capacity aware

 - Improve frequency-invariant scheduling

 - Misc cleanups in energy/power aware scheduling

 - Add sched_update_nr_running tracepoint to track changes to nr_running

 - Documentation additions and updates

 - Misc cleanups and smaller fixes

* tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst
  sched/doc: Document capacity aware scheduling
  sched: Document arch_scale_*_capacity()
  arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE
  Documentation/sysctl: Document uclamp sysctl knobs
  sched/uclamp: Add a new sysctl to control RT default boost value
  sched/uclamp: Fix a deadlock when enabling uclamp static key
  sched: Remove duplicated tick_nohz_full_enabled() check
  sched: Fix a typo in a comment
  sched/uclamp: Remove unnecessary mutex_init()
  arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE
  sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry
  arch_topology, sched/core: Cleanup thermal pressure definition
  trace/events/sched.h: fix duplicated word
  linux/sched/mm.h: drop duplicated words in comments
  smp: Fix a potential usage of stale nr_cpus
  sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
  sched: nohz: stop passing around unused "ticks" parameter.
  sched: Better document ttwu()
  sched: Add a tracepoint to track rq->nr_running
  ...
2020-08-03 14:58:38 -07:00
Linus Torvalds b34133fec8 - HW support updates:
- Add uncore support for Intel Comet Lake
 
    - Add RAPL support for Hygon Fam18h
 
    - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
      which enumerates per device performance counters via sysfs and enables
      the perf stat --iiostat functionality
 
    - Add support for Intel "Architectural LBRs", which generalized the model
      specific LBR hardware tracing feature into a model-independent, architected
      performance monitoring feature. Usage is mostly seamless to tooling, as the
      pre-existing LBR features are kept, but there's a couple of advantages
      under the hood, such as faster context-switching, faster LBR reads,
      cleaner exposure of LBR features to guest kernels, etc.
 
      ( Since architectural LBRs are supported via XSAVE, there's related
        changes to the x86 FPU code as well. )
 
  - ftrace/perf updates: Add support to add a text poke event to record changes
                         to kernel text (i.e. self-modifying code) in order to
                         support tracers like Intel PT decoding through
                         jump labels, kprobes and ftrace trampolines.
 
  - Misc cleanups, smaller fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8oAgcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gcSA/9EwKeLF03jkEXwzF/a/YhCxZXODH/klz/
 5D/Li+0HJy9TTVQWaSOxu31VcnWyAPER97aRjHohNMrAFKpAC4GwzxF2fjKUzzKJ
 eoWIgXvtlMM+nQb93UTB2+9Z3eHBEpKsqP8oc6qeXa74b2p3WfmvFRPBWFuzmOlH
 nb26F/Cu46HTEUfWvggU9flS0HpkdZ8X2Rt14sRwq5Gi2Wa/5+ygaksD+5nwRlGM
 r7jBrZBDTOGhy7HjrjpDPif056YU31giKmMQ/j17h1NaT3ciyXYSi0FuKEghDCII
 2OFyH0wZ1vsp63GISosIKFLFoBmOd4He4/sKjdtOtnosan250t3/ZDH/7tw6Rq2V
 tf1o/dMbDmV9v0lAVBZO76Z74ZQbk3+TvFxyDwtBSQYBe2eVfNz0VY4YjSRlRIp0
 1nIbJqiMLa7uquL2K4zZKapt7qsMaVqLO4YUVTzYPvv3luAqFLvC83a2+hapz4cs
 w4nET8lpWanUBK0hidQe1J6NPM4v1mnsvuZfM0p/QwKN9uvV5KoT6YJhRqfTy51g
 je+G80q0XqOH0H8x9iWuLiJe0G72UyhRqzSTxg+Cjj9cAhnsFPFLCNMWSVHqioLP
 JXGQiTp+6SQM6JDXkj5F8InsyT4KfzqizMSnAaH+6bsv9iQKDL4AbD7r92g6nbN9
 PP43QQh23Fg=
 =4pKU
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event updates from Ingo Molnar:
 "HW support updates:

   - Add uncore support for Intel Comet Lake

   - Add RAPL support for Hygon Fam18h

   - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
     which enumerates per device performance counters via sysfs and
     enables the perf stat --iiostat functionality

   - Add support for Intel "Architectural LBRs", which generalized the
     model specific LBR hardware tracing feature into a
     model-independent, architected performance monitoring feature.

     Usage is mostly seamless to tooling, as the pre-existing LBR
     features are kept, but there's a couple of advantages under the
     hood, such as faster context-switching, faster LBR reads, cleaner
     exposure of LBR features to guest kernels, etc.

     ( Since architectural LBRs are supported via XSAVE, there's related
       changes to the x86 FPU code as well. )

  ftrace/perf updates:

   - Add support to add a text poke event to record changes to kernel
     text (i.e. self-modifying code) in order to support tracers like
     Intel PT decoding through jump labels, kprobes and ftrace
     trampolines.

  Misc cleanups, smaller fixes..."

* tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
  perf/x86/rapl: Add Hygon Fam18h RAPL support
  kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer()
  x86/perf: Fix a typo
  perf: <linux/perf_event.h>: drop a duplicated word
  perf/x86/intel/lbr: Support XSAVES for arch LBR read
  perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch
  x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature
  x86/fpu/xstate: Support dynamic supervisor feature for LBR
  x86/fpu: Use proper mask to replace full instruction mask
  perf/x86: Remove task_ctx_size
  perf/x86/intel/lbr: Create kmem_cache for the LBR context data
  perf/core: Use kmem_cache to allocate the PMU specific data
  perf/core: Factor out functions to allocate/free the task_ctx_data
  perf/x86/intel/lbr: Support Architectural LBR
  perf/x86/intel/lbr: Factor out intel_pmu_store_lbr
  perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all()
  perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline
  perf/x86/intel/lbr: Unify the stored format of LBR information
  perf/x86/intel/lbr: Support LBR_CTL
  perf/x86: Expose CPUID enumeration bits for arch LBR
  ...
2020-08-03 14:51:09 -07:00
Linus Torvalds 9ba19ccd2d These were the main changes in this cycle:
- LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops.
 
  - KCSAN updates: the most important change is that GCC 11 now has all fixes in place
                   to support KCSAN, so GCC support can be enabled again. Also more annotations.
 
  - futex updates: minor cleanups and simplifications
 
  - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities.
 
  - lockdep updates:
     - simplify IRQ trace event handling
     - add various new debug checks
     - simplify header dependencies, split out <linux/lockdep_types.h>, decouple
       lockdep from other low level headers some more
     - fix NMI handling
 
  - misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8n9/wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hZFQ//dD+AKw9Nym+WbylovmeD0qxWxPyeN/jG
 vBVDTOJIJLtZTkZf6YHcYOJlPwaMDYUQluqTPQhsaQZy/NoEb5NM2cFAj2R9gjyT
 O8665T1dvhW9Sh353mBpuwviqdrnvCeHTBEcglSlFY7hxToYAflUN0+DXGVtNys8
 PFNf3L9SHT0GLVC8+di/eJzQaRqxiB0Pq7kvh2RvPJM/dcQNA9Ho3CCNO5j6qGoY
 u7OnMT8xJXkgbdjjUO4RO0v9VjMuNthZ2JiONDgvgKtJfIL2wt5YXIv1EYX0GuWp
 WZgIzE4o1G7GJOOzKpFfZFyK8grHu2fWgK1plvodWjlLkBmltJZ1qyOM+wngd/m2
 TgtPo73/YFbxFUbbBpkb0eiIaH2t99kMvfCWd05+GiPCtzn9UL9GfFRWd42vonwc
 sQWjFrHKlnuzifUfNcLmKg7R2nUtF3Dm/SydiTJ+9NtH/QA17YJKWnlE1moulNtQ
 p7H7+8UdcvSQ7F38A74v2IYNIyDsv5qcE8ar4QHdaanBBX/LCyD0UlfgsgxEReXf
 GDKkpx7LFQlI6Y2YB+dZgkCwhNBl3/OQ3v6hC95B37fA67dAIQyPIWHiHbaM+029
 gghqU4GcUcbjSnHPzl9PPL+hi9MyXrMjpb7CBXytg4NI4EE1waHR+0kX14V8ndRj
 MkWQOKPUgB0=
 =3MTT
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - LKMM updates: mostly documentation changes, but also some new litmus
   tests for atomic ops.

 - KCSAN updates: the most important change is that GCC 11 now has all
   fixes in place to support KCSAN, so GCC support can be enabled again.
   Also more annotations.

 - futex updates: minor cleanups and simplifications

 - seqlock updates: merge preparatory changes/cleanups for the
   'associated locks' facilities.

 - lockdep updates:
    - simplify IRQ trace event handling
    - add various new debug checks
    - simplify header dependencies, split out <linux/lockdep_types.h>,
      decouple lockdep from other low level headers some more
    - fix NMI handling

 - misc cleanups and smaller fixes

* tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  kcsan: Improve IRQ state trace reporting
  lockdep: Refactor IRQ trace events fields into struct
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  lockdep: Add preemption enabled/disabled assertion APIs
  seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()
  seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs
  seqlock: Reorder seqcount_t and seqlock_t API definitions
  seqlock: seqcount_t latch: End read sections with read_seqcount_retry()
  seqlock: Properly format kernel-doc code samples
  Documentation: locking: Describe seqlock design and usage
  locking/qspinlock: Do not include atomic.h from qspinlock_types.h
  locking/atomic: Move ATOMIC_INIT into linux/types.h
  lockdep: Move list.h inclusion into lockdep.h
  locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
  futex: Remove unused or redundant includes
  futex: Consistently use fshared as boolean
  futex: Remove needless goto's
  futex: Remove put_futex_key()
  rwsem: fix commas in initialisation
  docs: locking: Replace HTTP links with HTTPS ones
  ...
2020-08-03 14:39:35 -07:00
Linus Torvalds 5ece08178d A single commit that separates out the instrumentation_begin()/end() bits from compiler.h.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8n8LQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hvyA//akThywKHuyy9XDnHv4Llo6SgYmbxByX/
 krBPttId3feRwwiw4v+pbE9USQ4JAJUA0inciSJqO9xHFuz43h/QYQwdwShgGghS
 BvYVxYhLXVojIgxItZQ0XtxuRu2dpR1Rg4m1K6GXuAfdOiV+2C9fubIyDfSPa6n6
 82Sk/RgjXCklw6KB3evq5eRXPHHP4oSnvMRDD+SgPumDnsfwgcULvgHbLmr5yTHR
 0Zp1r4bkt0RsSpccyCTS48Kx4SuFZa8m6lsQLzEJw7y7ctukYiC7nP2uo+uZfGHg
 FwcRI2u0tWCQR4stc4upgBlB4cs84NlUXVHyy3zJeWcaBmHP5w4NkiEode+mtJIs
 Y/tovbcpCyHBFAnjeCD19GWZlx8pVtixusghcV2tchD+l/28stktIJZ2hgK59YYI
 TRd0F7SYhU0wQw8V3XdzjbKG6tXHKVNx7Y0Dj/arnHldN+WaD3ENpuPbpobDAtNg
 2EqbsUqDk4acoh8nBh7A0AJdm+C+ddLQqMo0kz9JvN3LXGQX/3Nvs2BuWB4TAlgm
 EIPaSa7c54TAletjwAGD39Si1TU26dSs5Jeaf84u2hsvBRM044mQNS9HzwLijysj
 ilW3N5zm4uAF1QOsWUvMUM65y1qFewibkt4v7yysmEJAaZWpWfkK7U57dTx3qTIj
 AQ1rsZxBg3s=
 =Lzcg
 -----END PGP SIGNATURE-----

Merge tag 'core-headers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull header cleanup from Ingo Molnar:
 "Separate out the instrumentation_begin()/end() bits from compiler.h"

* tag 'core-headers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  compiler.h: Move instrumentation_begin()/end() to new <linux/instrumentation.h> header
2020-08-03 14:25:40 -07:00
Linus Torvalds 3b4b84b2ea Fix a recent IRQ affinities regression, add in a missing debugfs printout
that helps the debugging of IRQ affinity logic bugs, and fix a memory leak.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8nEn8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ipYA/+KOWjDuRp1YBZeZ4/55RjGzimsW5jkLIY
 0Na3WGjN/QBKCzmRJNnMyW1UjRgpHpBhOsphTcHVdhJo9jg5+DX+XdVTwKGTqAI+
 7DqzP4dzifSgUwdcxIbKwtZquBRzKk1K0Z25b6Jc0WJwkGRx3LWhhRDERPUEHtXg
 Sl07XxiuqFLcQZz9o3hisKzEfA2llB4bfXOjLCJlLK3HUZKccoBjWKbTrI3ymCiz
 f0iV9a7kNzo4fJNddKOBTtDWFEhpj6NgEVtLNdAaDti7MSSjPbB1BsiK64UInGMQ
 4881ItYAOHGuCHe8yYnjlWA5kmwX14KjN6c3RAXK3n4+wvf+17RJC+FLH1PbkFIx
 hZ8k9x2Y5Dpt8vD8fGkoqi2nr2JYbIiOm79AjrD+Li+wWKG3iw4AGEoBHBJzKHUb
 naEGiUDJpn7pdpPWMACoctAIhy7/gDA1pPyb5F7Bf/RwoskIyu4i/d/xz05zBg3H
 HZMC2Lqcgh7LTS91NnmCx8XELdgL14mN19LK5enH3QTIPtdxmZ5x4quKw6ajMAAQ
 jwRpExqy6E1TQkIG5T5hjT0EMuj4uA6OzaoeOroFzKuzo+jiEDl49WAx+9Im9oBb
 i7hT4PM/wR7BcfmTMVhmmns4Dp0LkW7dRxHIjo7Fzft5iF8UkO7o7A4VUoAIxrSm
 xDFlBO/mo3w=
 =oKU9
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Fix a recent IRQ affinities regression, add in a missing debugfs
  printout that helps the debugging of IRQ affinity logic bugs, and fix
  a memory leak"

* tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/debugfs: Add missing irqchip flags
  genirq/affinity: Make affinity setting if activated opt-in
  irqdomain/treewide: Free firmware node after domain removal
2020-08-03 14:21:52 -07:00
Linus Torvalds ab5c60b79a Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add support for allocating transforms on a specific NUMA Node
   - Introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY for storage users

  Algorithms:
   - Drop PMULL based ghash on arm64
   - Fixes for building with clang on x86
   - Add sha256 helper that does the digest in one go
   - Add SP800-56A rev 3 validation checks to dh

  Drivers:
   - Permit users to specify NUMA node in hisilicon/zip
   - Add support for i.MX6 in imx-rngc
   - Add sa2ul crypto driver
   - Add BA431 hwrng driver
   - Add Ingenic JZ4780 and X1000 hwrng driver
   - Spread IRQ affinity in inside-secure and marvell/cesa"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (157 commits)
  crypto: sa2ul - Fix inconsistent IS_ERR and PTR_ERR
  hwrng: core - remove redundant initialization of variable ret
  crypto: x86/curve25519 - Remove unused carry variables
  crypto: ingenic - Add hardware RNG for Ingenic JZ4780 and X1000
  dt-bindings: RNG: Add Ingenic RNG bindings.
  crypto: caam/qi2 - add module alias
  crypto: caam - add more RNG hw error codes
  crypto: caam/jr - remove incorrect reference to caam_jr_register()
  crypto: caam - silence .setkey in case of bad key length
  crypto: caam/qi2 - create ahash shared descriptors only once
  crypto: caam/qi2 - fix error reporting for caam_hash_alloc
  crypto: caam - remove deadcode on 32-bit platforms
  crypto: ccp - use generic power management
  crypto: xts - Replace memcpy() invocation with simple assignment
  crypto: marvell/cesa - irq balance
  crypto: inside-secure - irq balance
  crypto: ecc - SP800-56A rev 3 local public key validation
  crypto: dh - SP800-56A rev 3 local public key validation
  crypto: dh - check validity of Z before export
  lib/mpi: Add mpi_sub_ui()
  ...
2020-08-03 10:40:14 -07:00
Rafael J. Wysocki c81b30c895 Merge branch 'pm-cpufreq'
* pm-cpufreq: (24 commits)
  cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode
  cpufreq: intel_pstate: Rearrange the storing of new EPP values
  cpufreq: intel_pstate: Avoid enabling HWP if EPP is not supported
  cpufreq: intel_pstate: Clean up aperf_mperf_shift description
  cpufreq: powernv: Make some symbols static
  cpufreq: amd_freq_sensitivity: Mark sometimes used ID structs as __maybe_unused
  cpufreq: intel_pstate: Supply struct attribute description for get_aperf_mperf_shift()
  cpufreq: pcc-cpufreq: Mark sometimes used ID structs as __maybe_unused
  cpufreq: powernow-k8: Mark 'hi' and 'lo' dummy variables as __always_unused
  cpufreq: acpi-cpufreq: Mark sometimes used ID structs as __maybe_unused
  cpufreq: acpi-cpufreq: Mark 'dummy' variable as __always_unused
  cpufreq: powernv-cpufreq: Fix a bunch of kerneldoc related issues
  cpufreq: pasemi: Include header file for {check,restore}_astate prototypes
  cpufreq: cpufreq_governor: Demote store_sampling_rate() header to standard comment block
  cpufreq: cpufreq: Demote lots of function headers unworthy of kerneldoc status
  cpufreq: freq_table: Demote obvious misuse of kerneldoc to standard comment blocks
  cpufreq: Replace HTTP links with HTTPS ones
  cpufreq: intel_pstate: Fix static checker warning for epp variable
  cpufreq: Remove the weakly defined cpufreq_default_governor()
  cpufreq: Specify default governor on command line
  ...
2020-08-03 13:12:36 +02:00
Ingo Molnar 992414a18c Merge branch 'locking/nmi' into locking/core, to pick up completed topic branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-03 13:00:27 +02:00
Randy Dunlap 4e722d4fe2 xen: hypercall.h: fix duplicated word
Change the repeated word "as" to "as a".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Link: https://lore.kernel.org/r/20200726001731.19540-1-rdunlap@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2020-08-03 07:48:39 +02:00
Linus Torvalds 5a30a78924 A single fix for a potential deadlock when printing a message about spurious interrupts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8nECQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hKohAAquK72js6bDeqiOW2SB8Rkq7rRdghMsKn
 P6W888BJWVNLRDsMKLlChlDWWQVzrslHBEAvUBwWyGphFaJXmSLyR+270/lRy0Mr
 cNXaTOfbK7nlP7JwSQAknkhodpl44edH99xE8TeZIyHCVKNXPy4UDutl3kYo16Xt
 ZnS/beyAI3k7qVFDijwGnMYLcrsdiGbbcbdBuCRuTUAX7lRN/bQGLAFsw1o0hb0O
 o060kuj46ZIXITfcM9+2cQt294+gx4x0VSjLsuHZaCAAjESEMsHBgW8mFBthFMkV
 TgNd2TQ1k3k3bumbu3iA4HbcDop4isvscSRmgPx3gw9mbKyKu9DSPk0/Q6bItV+0
 CHgC0gHSgIGR2/e8k8cxfcZFnQ0Rm0sPcLuXoAbC8HaW+Ux+nuDbzFV9eZc7tzwn
 Khbns9efQLlAGj6dC8oA+T9tuHSAhbSfT/IpH5FY1ZaT5H9FG+Evt/GMBwDgMiTx
 bCx2H01lBYX+1VO+jLG9qZ/K1X/e14PEJkUsO/BoVKSvVo0V2/Yk3ayMNn2e9i/Y
 yVPZR3tNtNYXPdykqNmhRTsoC8qEkBdbqms4dKk8DtiUcKpn2swXI4AtG6mS93Et
 R/uazY1kaW+8N8ssdN7SJOxSeCZyxT8HDPl7Kig9+rxoI/rXdE78nKiiga+widIl
 Fi3ojRgYDgk=
 =deY3
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
 "A single fix for a potential deadlock when printing a message about
  spurious interrupts"

* tag 'x86-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/i8259: Use printk_deferred() to prevent deadlock
2020-08-02 13:10:42 -07:00
Linus Torvalds 628e04dfeb Bugfixes and strengthening the validity checks on inputs from new userspace APIs.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8mdWQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM48Qf/eUYXVdGQWErQ/BQR/pLPfTUkjkIj
 1C/IGogWijbWPQrtIsnXVG53eULgHocbfzExZ8PzUSfjuZ/g9V8aHGbrGKbg/y6g
 SVArpCTu+BDidmLMbjsAKh3f5SEbzZZuFKnoxoEEKiA2TeYqDQ05nxcv9+T4udrs
 DWqXnk27y0vR9RtJkf5sqDnyH36cDT9TsbRFPaCt/hFE0UA65UTtWgl1ZaagzmuL
 I70z5Ap+/6fsEMoKoys9AKSCpLRGUBu/42fQhZkEq9no0w5PBkzn/YMfapHwT9I+
 1i9WUv3gn+FQOXpqAg0+aaPtWwIBIEONbm0qmw4yNNUI24Btq2QjGWxRqg==
 =VQPu
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Bugfixes and strengthening the validity checks on inputs from new
  userspace APIs.

  Now I know why I shouldn't prepare pull requests on the weekend, it's
  hard to concentrate if your son is shouting about his latest Minecraft
  builds in your ear. Fortunately all the patches were ready and I just
  had to check the test results..."

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM
  KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled
  KVM: arm64: Don't inherit exec permission across page-table levels
  KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions
  KVM: nVMX: check for invalid hdr.vmx.flags
  KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE
  selftests: kvm: do not set guest mode flag
2020-08-02 10:41:00 -07:00
Dan Carpenter ff2bd9ff11 KVM: SVM: Fix sev_pin_memory() error handling
The sev_pin_memory() function was modified to return error pointers
instead of NULL but there are two problems.  The first problem is that
if "npages" is zero then it still returns NULL.  Secondly, several of
the callers were not updated to check for error pointers instead of
NULL.

Either one of these issues will lead to an Oops.

Fixes: a8d908b587 ("KVM: x86: report sev_pin_memory errors with PTR_ERR")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20200714142351.GA315374@mwanda>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-02 05:23:06 -04:00
Ingo Molnar 63722bbca6 Merge branch 'kcsan' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core
Pull v5.9 KCSAN bits from Paul E. McKenney.

Perhaps the most important change is that GCC 11 now has all fixes in place
to support KCSAN, so GCC support can be enabled again.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-01 09:26:27 +02:00
Ingo Molnar 28cff52eae Merge branch 'linus' into locking/core, to resolve conflict
Conflicts:
	arch/arm/include/asm/percpu.h

As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:

  a21ee6055c ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")

and this fresh upstream commit:

  aa54ea903a ("ARM: percpu.h: fix build error")

a21ee6055c is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903a and uses the a21ee6055c solution.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:16:09 +02:00
Ingo Molnar adb334d178 Merge branch 'WIP.x86/entry' into x86/entry, to merge the latest generic code and resolve conflicts
Pick up and resolve the NMI entry code changes from the locking tree,
and also pick up the latest two fixes from tip:core/entry.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:13:21 +02:00
Nick Terrell fb46d057db x86: Add support for ZSTD compressed kernel
- Add support for zstd compressed kernel

- Define __DISABLE_EXPORTS in Makefile

- Remove __DISABLE_EXPORTS definition from kaslr.c

- Bump the heap size for zstd.

- Update the documentation.

Integrates the ZSTD decompression code to the x86 pre-boot code.

Zstandard requires slightly more memory during the kernel decompression
on x86 (192 KB vs 64 KB), and the memory usage is independent of the
window size.

__DISABLE_EXPORTS is now defined in the Makefile, which covers both
the existing use in kaslr.c, and the use needed by the zstd decompressor
in misc.c.

This patch has been boot tested with both a zstd and gzip compressed
kernel on i386 and x86_64 using buildroot and QEMU.

Additionally, this has been tested in production on x86_64 devices.
We saw a 2 second boot time reduction by switching kernel compression
from xz to zstd.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200730190841.2071656-7-nickrterrell@gmail.com
2020-07-31 11:49:09 +02:00
Nick Terrell 0fe4f4ef8c x86: Bump ZO_z_extra_bytes margin for zstd
Bump the ZO_z_extra_bytes margin for zstd.

Zstd needs 3 bytes per 128 KB, and has a 22 byte fixed overhead.
Zstd needs to maintain 128 KB of space at all times, since that is
the maximum block size. See the comments regarding in-place
decompression added in lib/decompress_unzstd.c for details.

The existing code is written so that all the compression algorithms use
the same ZO_z_extra_bytes. It is taken to be the maximum of the growth
rate plus the maximum fixed overhead. The comments just above this diff
state that:

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200730190841.2071656-6-nickrterrell@gmail.com
2020-07-31 11:49:08 +02:00
Arvind Sankar f49236ae42 x86/kaslr: Add a check that the random address is in range
Check in find_random_phys_addr() that the chosen address is inside the
range that was required.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-22-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 0eb1a8af01 x86/kaslr: Make local variables 64-bit
Change the type of local variables/fields that store mem_vector
addresses to u64 to make it less likely that 32-bit overflow will cause
issues on 32-bit.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-21-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 3a066990a3 x86/kaslr: Replace 'unsigned long long' with 'u64'
No functional change.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-20-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar e4cb955bf1 x86/kaslr: Make minimum/image_size 'unsigned long'
Change type of minimum/image_size arguments in process_mem_region to
'unsigned long'. These actually can never be above 4G (even on x86_64),
and they're 'unsigned long' in every other function except this one.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-19-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 4268b4da57 x86/kaslr: Small cleanup of find_random_phys_addr()
Just a trivial rearrangement to do all the processing together, and only
have one call to slots_fetch_random() in the source.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-18-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar eb38be6db5 x86/kaslr: Drop unnecessary alignment in find_random_virt_addr()
Drop unnecessary alignment of image_size to CONFIG_PHYSICAL_ALIGN in
find_random_virt_addr, it cannot change the result: the largest valid
slot is the largest n that satisfies

  minimum + n * CONFIG_PHYSICAL_ALIGN + image_size <= KERNEL_IMAGE_SIZE

(since minimum is already aligned) and so n is equal to

  (KERNEL_IMAGE_SIZE - minimum - image_size) / CONFIG_PHYSICAL_ALIGN

even if image_size is not aligned to CONFIG_PHYSICAL_ALIGN.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-17-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 46a5b29a4a x86/kaslr: Drop redundant check in store_slot_info()
Drop unnecessary check that number of slots is not zero in
store_slot_info, it's guaranteed to be at least 1 by the calculation.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-16-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar d6d0f36c73 x86/kaslr: Make the type of number of slots/slot areas consistent
The number of slots can be 'unsigned int', since on 64-bit, the maximum
amount of memory is 2^52, the minimum alignment is 2^21, so the slot
number cannot be greater than 2^31. But in case future processors have
more than 52 physical address bits, make it 'unsigned long'.

The slot areas are limited by MAX_SLOT_AREA, currently 100. It is
indexed by an int, but the number of areas is stored as 'unsigned long'.
Change both to 'unsigned int' for consistency.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-15-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 3870d97179 x86/kaslr: Drop test for command-line parameters before parsing
This check doesn't save anything. In the case when none of the
parameters are present, each strstr will scan args twice (once to find
the length and then for searching), six scans in total. Just going ahead
and parsing the arguments only requires three scans: strlen, memcpy, and
parsing. This will be the first malloc, so free will actually free up
the memory, so the check doesn't save heap space either.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-14-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar be9e8d9541 x86/kaslr: Simplify process_gb_huge_pages()
Replace the loop to determine the number of 1Gb pages with arithmetic.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-13-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 50def2693a x86/kaslr: Short-circuit gb_huge_pages on x86-32
32-bit does not have GB pages, so don't bother checking for them. Using
the IS_ENABLED() macro allows the compiler to completely remove the
gb_huge_pages code.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-12-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 79c2fd2afe x86/kaslr: Fix off-by-one error in process_gb_huge_pages()
If the remaining size of the region is exactly 1Gb, there is still one
hugepage that can be reserved.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-11-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar bf457be154 x86/kaslr: Drop some redundant checks from __process_mem_region()
Clip the start and end of the region to minimum and mem_limit prior to
the loop. region.start can only increase during the loop, so raising it
to minimum before the loop is enough.

A region that becomes empty due to this will get checked in
the first iteration of the loop.

Drop the check for overlap extending beyond the end of the region. This
will get checked in the next loop iteration anyway.

Rename end to region_end for symmetry with region.start.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-10-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar ef7b07d59e x86/kaslr: Drop redundant variable in __process_mem_region()
region.size can be trimmed to store the portion of the region before the
overlap, instead of a separate mem_vector variable.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-9-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar ee435ee649 x86/kaslr: Eliminate 'start_orig' local variable from __process_mem_region()
Set the region.size within the loop, which removes the need for
start_orig.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-8-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 3f9412c730 x86/kaslr: Drop redundant cur_entry from __process_mem_region()
cur_entry is only used as cur_entry.start + cur_entry.size, which is
always equal to end.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-7-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 8d1cf85958 x86/kaslr: Fix off-by-one error in __process_mem_region()
In case of an overlap, the beginning of the region should be used even
if it is exactly image_size, not just strictly larger.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-6-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 451286940d x86/kaslr: Initialize mem_limit to the real maximum address
On 64-bit, the kernel must be placed below MAXMEM (64TiB with 4-level
paging or 4PiB with 5-level paging). This is currently not enforced by
KASLR, which thus implicitly relies on physical memory being limited to
less than 64TiB.

On 32-bit, the limit is KERNEL_IMAGE_SIZE (512MiB). This is enforced by
special checks in __process_mem_region().

Initialize mem_limit to the maximum (depending on architecture), instead
of ULLONG_MAX, and make sure the command-line arguments can only
decrease it. This makes the enforcement explicit on 64-bit, and
eliminates the 32-bit specific checks to keep the kernel below 512M.

Check upfront to make sure the minimum address is below the limit before
doing any work.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200727230801.3468620-5-nivedita@alum.mit.edu
2020-07-31 11:08:17 +02:00
Arvind Sankar 0870536556 x86/kaslr: Fix process_efi_entries comment
Since commit:

  0982adc746 ("x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* and EFI_LOADER_* from KASLR's choice")

process_efi_entries() will return true if we have an EFI memmap, not just
if it contained EFI_MEMORY_MORE_RELIABLE regions.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200727230801.3468620-4-nivedita@alum.mit.edu
2020-07-31 11:08:12 +02:00
Arvind Sankar e2ee617316 x86/kaslr: Remove bogus warning and unnecessary goto
Drop the warning on seeing "--" in handle_mem_options(). This will trigger
whenever one of the memory options is present in the command line
together with "--", but there's no problem if that is the case.

Replace goto with break.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200727230801.3468620-3-nivedita@alum.mit.edu
2020-07-31 11:08:07 +02:00
Arvind Sankar 709709ac64 x86/kaslr: Make command line handling safer
Handle the possibility that the command line is NULL.

Replace open-coded strlen with a function call.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200727230801.3468620-2-nivedita@alum.mit.edu
2020-07-31 11:07:51 +02:00
Herbert Xu 054a5540fb crypto: x86/curve25519 - Remove unused carry variables
The carry variables are assigned but never used, which upsets
the compiler.  This patch removes them.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Karthikeyan Bhargavan <karthik.bhargavan@gmail.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-31 18:25:29 +10:00
Wanpeng Li a445fc457d KVM: LAPIC: Set the TDCR settable bits
It is a little different between Intel and AMD, Intel's bit 2
is 0 and AMD is reserved. On bare-metal, Intel will refuse to set
APIC_TDCR once bits except 0, 1, 3 are setting, however, AMD will
accept bits 0, 1, 3 and ignore other bits setting as patch does.
Before the patch, we can get back anything what we set to the
APIC_TDCR, this patch improves it.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-2-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-31 03:21:03 -04:00
Wanpeng Li 830f01b089 KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM
'Commit 8566ac8b8e ("KVM: SVM: Implement pause loop exit logic in SVM")'
drops disable pause loop exit/pause filtering capability completely, I
guess it is a merge fault by Radim since disable vmexits capabilities and
pause loop exit for SVM patchsets are merged at the same time. This patch
reintroduces the disable pause loop exit/pause filtering capability support.

Reported-by: Haiwei Li <lihaiwei@tencent.com>
Tested-by: Haiwei Li <lihaiwei@tencent.com>
Fixes: 8566ac8b ("KVM: SVM: Implement pause loop exit logic in SVM")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-3-git-send-email-wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-31 03:20:32 -04:00
Wanpeng Li d2286ba7d5 KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled
Prevent setting the tscdeadline timer if the lapic is hw disabled.

Fixes: bce87cce88 (KVM: x86: consolidate different ways to test for in-kernel LAPIC)
Cc: <stable@vger.kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-31 03:20:15 -04:00
Sean Christopherson 83013059bd KVM: x86: Specify max TDP level via kvm_configure_mmu()
Capture the max TDP level during kvm_configure_mmu() instead of using a
kvm_x86_ops hook to do it at every vCPU creation.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-10-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:18:08 -04:00
Sean Christopherson 1d92d2e8e7 KVM: x86/mmu: Rename max_page_level to max_huge_page_level
Rename max_page_level to explicitly call out that it tracks the max huge
page level so as to avoid confusion when a future patch moves the max
TDP level, i.e. max root level, into the MMU and kvm_configure_mmu().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-9-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:17:58 -04:00
Sean Christopherson d468d94b7b KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
Calculate the desired TDP level on the fly using the max TDP level and
MAXPHYADDR instead of doing the same when CPUID is updated.  This avoids
the hidden dependency on cpuid_maxphyaddr() in vmx_get_tdp_level() and
also standardizes the "use 5-level paging iff MAXPHYADDR > 48" behavior
across x86.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-8-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:17:16 -04:00
Sean Christopherson f83a4a6932 KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
Remove the WARN in vmx_load_mmu_pgd() that was temporarily added to aid
bisection/debug in the event the current MMU's shadow root level didn't
match VMX's computed EPTP level.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-7-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:17:00 -04:00
Sean Christopherson 2a40b9001e KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
Use the shadow_root_level from the current MMU as the root level for the
PGD, i.e. for VMX's EPTP.  This eliminates the weird dependency between
VMX and the MMU where both must independently calculate the same root
level for things to work correctly.  Temporarily keep VMX's calculation
of the level and use it to WARN if the incoming level diverges.

Opportunistically refactor kvm_mmu_load_pgd() to avoid indentation hell,
and rename a 'cr3' param in the load_mmu_pgd prototype that managed to
survive the cr3 purge.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-6-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:16:47 -04:00
Sean Christopherson 812f805836 KVM: VMX: Make vmx_load_mmu_pgd() static
Make vmx_load_mmu_pgd() static as it is no longer invoked directly by
nested VMX (or any code for that matter).

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:14:50 -04:00
Sean Christopherson 59505b55aa KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
Refactor the shadow NPT role calculation into a separate helper to
better differentiate it from the non-nested shadow MMU, e.g. the NPT
variant is never direct and derives its root level from the TDP level.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-3-sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:14:34 -04:00
Sean Christopherson f291a358e0 KVM: VMX: Drop a duplicate declaration of construct_eptp()
Remove an extra declaration of construct_eptp() from vmx.h.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:13:47 -04:00
Sean Christopherson 096586fda5 KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
Move the initialization of shadow NPT MMU's shadow_root_level into
kvm_init_shadow_npt_mmu() and explicitly set the level in the shadow NPT
MMU's role to be the TDP level.  This ensures the role and MMU levels
are synchronized and also initialized before __kvm_mmu_new_pgd(), which
consumes the level when attempting a fast PGD switch.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Fixes: 9fa72119b2 ("kvm: x86: Introduce kvm_mmu_calc_root_page_role()")
Fixes: a506fdd223 ("KVM: nSVM: implement nested_svm_load_cr3() and use it for host->guest switch")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200716034122.5998-2-sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-30 18:13:23 -04:00
Thomas Gleixner f3020b8891 x86/kvm: Use __xfer_to_guest_mode_work_pending() in kvm_run_vcpu()
The comments explicitely explain that the work flags check and handling in
kvm_run_vcpu() is done with preemption and interrupts enabled as KVM
invokes the check again right before entering guest mode with interrupts
disabled which guarantees that the work flags are observed and handled
before VMENTER.

Nevertheless the flag pending check in kvm_run_vcpu() uses the helper
variant which requires interrupts to be disabled triggering an instant
lockdep splat. This was caught in testing before and then not fixed up in
the patch before applying. :(

Use the relaxed and intentionally racy __xfer_to_guest_mode_work_pending()
instead.

Fixes: 72c3c0fe54 ("x86/kvm: Use generic xfer to guest work function")
Reported-by: Qian Cai <cai@lca.pw> writes:
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87bljxa2sa.fsf@nanos.tec.linutronix.de
2020-07-30 12:31:47 +02:00
Christoph Hellwig c8376994c8 initrd: remove support for multiple floppies
Remove the special handling for multiple floppies in the initrd code.
No one should be using floppies for booting these days. (famous last
words..)

Includes a spelling fix from Colin Ian King <colin.king@canonical.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-30 08:22:33 +02:00
Thomas Gleixner bdd6558959 x86/i8259: Use printk_deferred() to prevent deadlock
0day reported a possible circular locking dependency:

Chain exists of:
  &irq_desc_lock_class --> console_owner --> &port_lock_key

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&port_lock_key);
                               lock(console_owner);
                               lock(&port_lock_key);
  lock(&irq_desc_lock_class);

The reason for this is a printk() in the i8259 interrupt chip driver
which is invoked with the irq descriptor lock held, which reverses the
lock operations vs. printk() from arbitrary contexts.

Switch the printk() to printk_deferred() to avoid that.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87365abt2v.fsf@nanos.tec.linutronix.de
2020-07-29 16:27:16 +02:00
Peter Zijlstra f05d67179d Merge branch 'locking/header' 2020-07-29 16:14:21 +02:00
Herbert Xu 7ca8cf5347 locking/atomic: Move ATOMIC_INIT into linux/types.h
This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h.
This allows users of atomic_t to use ATOMIC_INIT without having to
include atomic.h as that way may lead to header loops.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
2020-07-29 16:14:18 +02:00
Joerg Roedel 56fbacc9bf Merge branches 'arm/renesas', 'arm/qcom', 'arm/mediatek', 'arm/omap', 'arm/exynos', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'core' into next 2020-07-29 14:42:00 +02:00
Pu Wen d903b6d029 perf/x86/rapl: Add Hygon Fam18h RAPL support
Hygon Family 18h(Dhyana) support RAPL in bit 14 of CPUID 0x80000007 EDX,
and has MSRs RAPL_PWR_UNIT/CORE_ENERGY_STAT/PKG_ENERGY_STAT. So add Hygon
Dhyana Family 18h support for RAPL.

The output is available via the energy-pkg pseudo event:

  $ perf stat -a -I 1000 --per-socket -e power/energy-pkg/

[ mingo: Tidied up the initializers. ]

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200720082205.1307-1-puwen@hygon.cn
2020-07-28 13:34:20 +02:00
Ingo Molnar e89d4ca1df Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-28 13:18:01 +02:00
Al Viro 0557d64d98 x86: switch to ->regset_get()
All instances of ->get() in arch/x86 switched; that might or might
not be worth splitting up.  Notes:

	* for xstateregs_get() the amount we want to store is determined at
the boot time; see init_xstate_size() and update_regset_xstate_info() for
details.  task->thread.fpu.state.xsave ends with a flexible array member and
the amount of data in it depends upon the FPU features supported/enabled.

	* fpregs_get() writes slightly less than full ->thread.fpu.state.fsave
(the last word is not copied); we pass the full size of state.fsave and let
membuf_write() trim to the amount declared by regset - __regset_get() will
make sure that the space in buffer is no more than that.

	* copy_xstate_to_user() and its helpers are gone now.

	* fpregs_soft_get() was getting user_regset_copyout() arguments
wrong.  Since "x86: x86 user_regset math_emu" back in 2008...  I really
doubt that it's worth splitting out for -stable, though - you need
a 486SX box for that to trigger...

[Kevin's braino fix for copy_xstate_to_kernel() essentially duplicated here]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:31:07 -04:00
Al Viro 7a896028ad kill elf_fpxregs_t
all uses are conditional upon ELF_CORE_COPY_XFPREGS, which has not
been defined on any architecture since 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:29:23 -04:00
Haiwei Li 9c2475f3e4 KVM: Using macros instead of magic values
Instead of using magic values, use macros.

Signed-off-by: Haiwei Li <lihaiwei@tencent.com>
Message-Id: <4c072161-80dd-b7ed-7adb-02acccaa0701@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 10:40:25 -04:00
Thomas Gleixner f0c7baca18 genirq/affinity: Make affinity setting if activated opt-in
John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:

 "It looks like what happens is that because the interrupts are not per-CPU
  in the hardware, armpmu_request_irq() calls irq_force_affinity() while
  the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
  IRQF_NOBALANCING.  

  Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
  irq_setup_affinity() which returns early because IRQF_PERCPU and
  IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."

This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.

Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...

Fixes: baedb87d1b ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
2020-07-27 16:20:40 +02:00
peterz@infradead.org ed00495333 locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
Prior to commit:

  859d069ee1 ("lockdep: Prepare for NMI IRQ state tracking")

IRQ state tracking was disabled in NMIs due to nmi_enter()
doing lockdep_off() -- with the obvious requirement that NMI entry
call nmi_enter() before trace_hardirqs_off().

[ AFAICT, PowerPC and SH violate this order on their NMI entry ]

However, that commit explicitly changed lockdep_hardirqs_*() to ignore
lockdep_off() and breaks every architecture that has irq-tracing in
it's NMI entry that hasn't been fixed up (x86 being the only fixed one
at this point).

The reason for this change is that by ignoring lockdep_off() we can:

  - get rid of 'current->lockdep_recursion' in lockdep_assert_irqs*()
    which was going to to give header-recursion issues with the
    seqlock rework.

  - allow these lockdep_assert_*() macros to function in NMI context.

Restore the previous state of things and allow an architecture to
opt-in to the NMI IRQ tracking support, however instead of relying on
lockdep_off(), rely on in_nmi(), both are part of nmi_enter() and so
over-all entry ordering doesn't need to change.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200727124852.GK119549@hirez.programming.kicks-ass.net
2020-07-27 15:13:29 +02:00
Paolo Bonzini 5e105c88ab KVM: nVMX: check for invalid hdr.vmx.flags
hdr.vmx.flags is meant for future extensions to the ABI, rejecting
invalid flags is necessary to avoid broken half-loads of the
nVMX state.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:04:50 -04:00
Paolo Bonzini 0f02bd0ade KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE
A missing VMCS12 was not causing -EINVAL (it was just read with
copy_from_user, so it is not a security issue, but it is still
wrong).  Test for VMCS12 validity and reject the nested state
if a VMCS12 is required but not present.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-27 09:04:49 -04:00
Ricardo Neri f69ca629d8 x86/cpu: Refactor sync_core() for readability
Instead of having #ifdef/#endif blocks inside sync_core() for X86_64 and
X86_32, implement the new function iret_to_self() with two versions.

In this manner, avoid having to use even more more #ifdef/#endif blocks
when adding support for SERIALIZE in sync_core().

Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200727043132.15082-4-ricardo.neri-calderon@linux.intel.com
2020-07-27 12:42:06 +02:00
Ricardo Neri 9998a9832c x86/cpu: Relocate sync_core() to sync_core.h
Having sync_core() in processor.h is problematic since it is not possible
to check for hardware capabilities via the *cpu_has() family of macros.
The latter needs the definitions in processor.h.

It also looks more intuitive to relocate the function to sync_core.h.

This changeset does not make changes in functionality.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20200727043132.15082-3-ricardo.neri-calderon@linux.intel.com
2020-07-27 12:42:06 +02:00
Ricardo Neri 85b23fbc7d x86/cpufeatures: Add enumeration for SERIALIZE instruction
The Intel architecture defines a set of Serializing Instructions (a
detailed definition can be found in Vol.3 Section 8.3 of the Intel "main"
manual, SDM). However, these instructions do more than what is required,
have side effects and/or may be rather invasive. Furthermore, some of
these instructions are only available in kernel mode or may cause VMExits.
Thus, software using these instructions only to serialize execution (as
defined in the manual) must handle the undesired side effects.

As indicated in the name, SERIALIZE is a new Intel architecture
Serializing Instruction. Crucially, it does not have any of the mentioned
side effects. Also, it does not cause VMExit and can be used in user mode.

This new instruction is currently documented in the latest "extensions"
manual (ISE). It will appear in the "main" manual in the future.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20200727043132.15082-2-ricardo.neri-calderon@linux.intel.com
2020-07-27 12:42:06 +02:00
Ingo Molnar 538b10856b Linux 5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8d8h4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGd0sH/2iktYhMwPxzzpnb
 eI3OuTX/mRn4vUFOfpx9dmGVleMfKkpbvnn3IY7wA62Qfv7J7lkFRa1Bd1DlqXfW
 yyGTGDSKG5chiRCOU3s9ni92M4xIzFlrojyt/dIK2lUGMzUPI9FGlZRGQLKqqwLh
 2syOXRWbcQ7e52IHtDSy3YBNveKRsP4NyqV+GxGiex18SMB/M3Pw9EMH614eDPsE
 QAGQi5uGv4hPJtFHgXgUyBPLFHIyFAiVxhFRIj7u2DSEKY79+wO1CGWFiFvdTY4B
 CbqKXLffY3iQdFsLJkj9Dl8cnOQnoY44V0EBzhhORxeOp71StUVaRwQMFa5tp48G
 171s5Hs=
 =BQIl
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc7' into x86/cpu, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-27 12:41:13 +02:00
Rafael J. Wysocki 80e3036866 Merge back cpufreq material for v5.9. 2020-07-27 12:34:55 +02:00
Joerg Roedel 2b32ab031e x86/mm/64: Make sync_global_pgds() static
The function is only called from within init_64.c and can be static.
Also remove it from pgtable_64.h.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20200721095953.6218-4-joro@8bytes.org
2020-07-27 12:32:29 +02:00
Joerg Roedel 8bb9bf242d x86/mm/64: Do not sync vmalloc/ioremap mappings
Remove the code to sync the vmalloc and ioremap ranges for x86-64. The
page-table pages are all pre-allocated now so that synchronization is
no longer necessary.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20200721095953.6218-3-joro@8bytes.org
2020-07-27 12:32:29 +02:00
Joerg Roedel 6eb82f9940 x86/mm: Pre-allocate P4D/PUD pages for vmalloc area
Pre-allocate the page-table pages for the vmalloc area at the level
which needs synchronization on x86-64, which is P4D for 5-level and
PUD for 4-level paging.

Doing this at boot makes sure no synchronization of that area is
necessary at runtime. The synchronization takes the pgd_lock and
iterates over all page-tables in the system, so it can take quite long
and is better avoided.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20200721095953.6218-2-joro@8bytes.org
2020-07-27 12:32:29 +02:00
Colin Ian King 90fc73928f x86/ioperm: Initialize pointer bitmap with NULL rather than 0
The pointer bitmap is being initialized with a plain integer 0,
fix this by initializing it with a NULL instead.

Cleans up sparse warning:
arch/x86/xen/enlighten_pv.c:876:27: warning: Using plain integer
as NULL pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20200721100217.407975-1-colin.king@canonical.com
2020-07-26 19:52:55 +02:00
Ingo Molnar 2d65685a4a Merge branch 'x86/urgent' into x86/cleanups
Refresh the branch for a dependent commit.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-26 19:52:30 +02:00
Ingo Molnar f87032aec4 Merge branch 'locking/nmi' into x86/entry
Resolve conflicts with ongoing lockdep work that fixed the NMI entry code.

Conflicts:
	arch/x86/entry/common.c
	arch/x86/include/asm/idtentry.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-26 18:26:25 +02:00
Randy Dunlap de0038bfaf x86: uv: uv_hub.h: Delete duplicated word
Delete the repeated word "the".

[ mingo: While at it, also capitalize CPU properly. ]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200726004124.20618-4-rdunlap@infradead.org
2020-07-26 12:47:22 +02:00
Randy Dunlap 8b9fd48eb7 x86: cmpxchg_32.h: Delete duplicated word
Delete the repeated word "you".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200726004124.20618-3-rdunlap@infradead.org
2020-07-26 12:47:22 +02:00
Randy Dunlap ddeddd0811 x86: bootparam.h: Delete duplicated word
Delete the repeated word "for".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200726004124.20618-2-rdunlap@infradead.org
2020-07-26 12:47:22 +02:00
David S. Miller a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Linus Torvalds fbe0d451bc Misc fixes:
- Fix a section end page alignment assumption that was causing crashes
 
  - Fix ORC unwinding on freshly forked tasks which haven't executed yet
    and which have empty user task stacks
 
  - Fix the debug.exception-trace=1 sysctl dumping of user stacks, which
    was broken by recent maccess changes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cGlwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hVGxAAhW5LoNLv6BsMtKrFhBJRzPhoO8YcB8ep
 Z/MaaiEcS8kcSVx1nQrNNhqzwIGBlexHTyoddYQr/LeM3ahOoXc4+jZI7Es6WKdc
 Rhofalp0pryJrqj52orAVAUeHikmiqmadANKKmUNtXcjWWoxexgab28gLwJPPA55
 SKw1/tqQ6JTf/M6tLnalib/VcigN7F/r797LgqLan9pSkjMGj6f1zMFmqTsNDi05
 Si5gqRv2GlAraelRQU7emMZGu1bXhfJvGix7BB1/u6nRimY15vIFtccgt4akJmFN
 tBUwNHFeLrYsoX1txjMTtGzASBqtjaD6A1K2geqmuZ1lRL1YuGnUl7n4ccgAvQdJ
 eY6iiLiXDcjZLZXYe5tfdV/4p3HZ3MYm/IRkaw2KfiGAIv0Gii/U0mShO+lX1PAZ
 BdelgJv2fcOxDiNe0RM4EniBXQUcqDIOpQ1fPryYsFLeeA1LCK2/b8WLOPHuK8YG
 oxAgU9vV7OA7nfBdxIPszvX1jXOiQPglsW2gfxlKbGAA9j3uKDGhGnmFFYatYUMF
 OEOf/5iuw3MZE/+gAG685ksk4ibgW4Kv6ahAEdzs5hh37uM8SiC3oWVWQf2YWGHG
 ITswqKY2oUqghWa3ftXEtPLKstsYbRG7KCRfBXh6AN2ay7JrDJg6sItCL4UI5BjK
 dkDJ4pKLXug=
 =oamN
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - Fix a section end page alignment assumption that was causing
     crashes

   - Fix ORC unwinding on freshly forked tasks which haven't executed
     yet and which have empty user task stacks

   - Fix the debug.exception-trace=1 sysctl dumping of user stacks,
     which was broken by recent maccess changes"

* tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Dump user space code correctly again
  x86/stacktrace: Fix reliable check for empty user task stacks
  x86/unwind/orc: Fix ORC for newly forked tasks
  x86, vmlinux.lds: Page-align end of ..page_aligned sections
2020-07-25 14:25:47 -07:00
Ingo Molnar c84d53051f Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc6' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-25 21:49:36 +02:00
Hayato Ohhashi 898ec52d2b x86/xen/time: Set the X86_FEATURE_TSC_KNOWN_FREQ flag in xen_tsc_khz()
If the TSC frequency is known from the pvclock page,
the TSC frequency does not need to be recalibrated.

We can avoid recalibration by setting X86_FEATURE_TSC_KNOWN_FREQ.

Signed-off-by: Hayato Ohhashi <o8@vmm.dev>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20200721161231.6019-1-o8@vmm.dev
2020-07-25 15:49:41 +02:00
Fenghua Yu 3aae57f0c3 x86/split_lock: Enable the split lock feature on Sapphire Rapids and Alder Lake CPUs
Add Sapphire Rapids and Alder Lake processors to CPU list to enumerate
and enable the split lock feature.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/1595634320-79689-1-git-send-email-fenghua.yu@intel.com
2020-07-25 12:17:00 +02:00
Tony Luck e00b62f0b0 x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU family
Add three new Intel CPU models.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200721043749.31567-1-tony.luck@intel.com
2020-07-25 12:16:59 +02:00
Ingo Molnar fb4405ae6e Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc6' into x86/cpu, to refresh the branch before adding new commits

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-25 12:16:16 +02:00
Ingo Molnar 1d0e12fd3a x86/defconfigs: Refresh defconfig files
Perform a 'make savedefconfig' pass over our main defconfig files,
which keeps the defconfig result the same, but compresses
the file where defaults were changed, options removed or
reordered.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200724130638.645844-2-mingo@kernel.org
2020-07-25 12:02:14 +02:00
Ingo Molnar 4b8e0328e5 x86/mm: Remove the unused mk_kernel_pgd() #define
AFAICS the last uses of directly 'making' kernel PGDs was removed 7 years ago:

  8b78c21d72d9: ("x86, 64bit, mm: hibernate use generic mapping_init")

Where the explicit PGD walking loop was replaced with kernel_ident_mapping_init()
calls. This was then (unnecessarily) carried over through the 5-level paging conversion.

Also clean up the 'level' comments a bit, to convey the original, meanwhile somewhat
bit-rotten notion, that these are empty comment blocks with no methods to handle any
of the levels except the PTE level.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200724114418.629021-4-mingo@kernel.org
2020-07-25 12:00:57 +02:00
Ingo Molnar 161449bad5 x86/tsc: Remove unused "US_SCALE" and "NS_SCALE" leftover macros
Last use of them was removed 13 years ago, when the code was converted
to use CYC2NS_SCALE_FACTOR:

  53d517cdbaac: ("x86: scale cyc_2_nsec according to CPU frequency")

The current TSC code uses the 'struct cyc2ns_data' scaling abstraction,
the old fixed scaling approach is long gone.

This cleanup also removes the 'arbitralrily' typo from the comment,
so win-win. ;-)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200724114418.629021-3-mingo@kernel.org
2020-07-25 12:00:57 +02:00
Ingo Molnar 8cd591aeb1 x86/ioapic: Remove unused "IOAPIC_AUTO" define
Last use was removed more than 5 years ago, in:

   5ad274d41c1b: ("x86/irq: Remove unused old IOAPIC irqdomain interfaces")

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200724114418.629021-2-mingo@kernel.org
2020-07-25 12:00:56 +02:00
Colin Ian King 1c026a18d4 xen: Remove redundant initialization of irq
The variable irq is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Link: https://lore.kernel.org/r/20200611123134.922395-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-07-24 09:50:22 -05:00
Arvind Sankar 587af649bc x86/build: Move max-page-size option to LDFLAGS_vmlinux
This option is only required for vmlinux on 64-bit, to enforce 2MiB
alignment, so set it in LDFLAGS_vmlinux instead of KBUILD_LDFLAGS. Also
drop the ld-option check: this option was added in binutils-2.18 and all
the other places that use it already don't have the check.

This reduces the size of the intermediate ELF files
arch/x86/boot/setup.elf and arch/x86/realmode/rm/realmode.elf by about
2MiB each. The binary versions are unchanged.

Move the LDFLAGS settings to all be together and just after CFLAGS
settings are done.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Link: https://lore.kernel.org/r/20200722184334.3785418-1-nivedita@alum.mit.edu
2020-07-24 16:39:27 +02:00
Thomas Gleixner 72c3c0fe54 x86/kvm: Use generic xfer to guest work function
Use the generic infrastructure to check for and handle pending work before
transitioning into guest mode.

This now handles TIF_NOTIFY_RESUME as well which was ignored so
far. Handling it is important as this covers task work and task work will
be used to offload the heavy lifting of POSIX CPU timers to thread context.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.979724969@linutronix.de
2020-07-24 15:05:01 +02:00
Thomas Gleixner a27a0a5549 x86/entry: Cleanup idtentry_enter/exit
Remove the temporary defines and fixup all references.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.855839271@linutronix.de
2020-07-24 15:05:01 +02:00
Thomas Gleixner bdcd178ada x86/entry: Use generic interrupt entry/exit code
Replace the x86 code with the generic variant. Use temporary defines for
idtentry_* which will be cleaned up in the next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.711492752@linutronix.de
2020-07-24 15:05:00 +02:00
Thomas Gleixner 517e499227 x86/entry: Cleanup idtentry_entry/exit_user
Cleanup the temporary defines and use irqentry_ instead of idtentry_.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.602603691@linutronix.de
2020-07-24 15:05:00 +02:00
Thomas Gleixner 167fd210ec x86/entry: Use generic syscall exit functionality
Replace the x86 variant with the generic version. Provide the relevant
architecture specific helper functions and defines.

Use a temporary define for idtentry_exit_user which will be cleaned up
seperately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.494648601@linutronix.de
2020-07-24 15:04:59 +02:00
Thomas Gleixner 27d6b4d14f x86/entry: Use generic syscall entry function
Replace the syscall entry work handling with the generic version. Provide
the necessary helper inlines to handle the real architecture specific
parts, e.g. ptrace.

Use a temporary define for idtentry_enter_user which will be cleaned up
seperately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.376213694@linutronix.de
2020-07-24 15:04:59 +02:00
Thomas Gleixner 0bf019ea59 x86/ptrace: Provide pt_regs helper for entry/exit
As a preparatory step for moving the syscall and interrupt entry/exit
handling into generic code, provide a pt_regs helper which retrieves the
interrupt state from pt_regs. This is required to check whether interrupts
are reenabled by return from interrupt/exception.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.258511584@linutronix.de
2020-07-24 15:04:58 +02:00
Thomas Gleixner a377ac1cd9 x86/entry: Move user return notifier out of loop
Guests and user space share certain MSRs. KVM sets these MSRs to guest
values once and does not set them back to user space values on every VM
exit to spare the costly MSR operations.

User return notifiers ensure that these MSRs are set back to the correct
values before returning to user space in exit_to_usermode_loop().

There is no reason to evaluate the TIF flag indicating that user return
notifiers need to be invoked in the loop. The important point is that they
are invoked before returning to user space.

Move the invocation out of the loop into the section which does the last
preperatory steps before returning to user space. That section is not
preemptible and runs with interrupts disabled until the actual return.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.159112003@linutronix.de
2020-07-24 15:04:58 +02:00
Thomas Gleixner 0b085e68f4 x86/entry: Consolidate 32/64 bit syscall entry
64bit and 32bit entry code have the same open coded syscall entry handling
after the bitwidth specific bits.

Move it to a helper function and share the code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.051234096@linutronix.de
2020-07-24 15:04:58 +02:00
Thomas Gleixner 8d5ea35c5e x86/entry: Consolidate check_user_regs()
The user register sanity check is sprinkled all over the place. Move it
into enter_from_user_mode().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.943016204@linutronix.de
2020-07-24 15:04:57 +02:00
Thomas Gleixner b35ad8405d Merge branch 'core/entry' into x86/entry
Pick up generic entry code to migrate x86 over.
2020-07-24 15:03:59 +02:00
Sedat Dilek 6526b12de0 x86/defconfigs: Remove CONFIG_CRYPTO_AES_586 from i386_defconfig
Initially CONFIG_CRYPTO_AES_586=y was added to the i386_defconfig file in:

  c1b362e3b4d3: ("x86: update defconfigs")

The code and Kconfig for CONFIG_CRYPTO_AES_586 was removed in:

  1d2c3279311e: ("crypto: x86/aes - drop scalar assembler implementations")

Remove the leftover from the i386_defconfig file as well.

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20200723171119.9881-1-sedat.dilek@gmail.com
2020-07-24 14:55:55 +02:00
Ingo Molnar d19e789f06 compiler.h: Move instrumentation_begin()/end() to new <linux/instrumentation.h> header
Linus pointed out that compiler.h - which is a key header that gets included in every
single one of the 28,000+ kernel files during a kernel build - was bloated in:

  655389666643: ("vmlinux.lds.h: Create section for protection against instrumentation")

Linus noted:

 > I have pulled this, but do we really want to add this to a header file
 > that is _so_ core that it gets included for basically every single
 > file built?
 >
 > I don't even see those instrumentation_begin/end() things used
 > anywhere right now.
 >
 > It seems excessive. That 53 lines is maybe not a lot, but it pushed
 > that header file to over 12kB, and while it's mostly comments, it's
 > extra IO and parsing basically for _every_ single file compiled in the
 > kernel.
 >
 > For what appears to be absolutely zero upside right now, and I really
 > don't see why this should be in such a core header file!

Move these primitives into a new header: <linux/instrumentation.h>, and include that
header in the headers that make use of it.

Unfortunately one of these headers is asm-generic/bug.h, which does get included
in a lot of places, similarly to compiler.h. So the de-bloating effect isn't as
good as we'd like it to be - but at least the interfaces are defined separately.

No change to functionality intended.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200604071921.GA1361070@gmail.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2020-07-24 13:56:23 +02:00
Ira Weiny 7f6fa101df x86: Correct noinstr qualifiers
The noinstr qualifier is to be specified before the return type in the
same way inline is used.

These 2 cases were missed by previous patches.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200723161405.852613-1-ira.weiny@intel.com
2020-07-24 09:54:15 +02:00
Arvind Sankar 0a787b28b7 x86/mm: Drop unused MAX_PHYSADDR_BITS
The macro is not used anywhere, and has an incorrect value (going by the
comment) on x86_64 since commit c898faf91b ("x86: 46 bit physical address
support on 64 bits")

To avoid confusion, just remove the definition.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200723231544.17274-2-nivedita@alum.mit.edu
2020-07-24 09:53:06 +02:00
Dave Airlie 41206a073c Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge v5.8-rc6 into drm-next

I've got a silent conflict + two trees based on fixes to merge.

Fixes a silent merge with amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-24 08:48:05 +10:00
Nick Desaulniers 158807de58 x86/uaccess: Make __get_user_size() Clang compliant on 32-bit
Clang fails to compile __get_user_size() on 32-bit for the following code:

      long long val;

      __get_user(val, usrptr);

with: error: invalid output size for constraint '=q'

GCC compiles the same code without complaints.

The reason is that GCC and Clang are architecturally different, which leads
to subtle issues for code that's invalid but clearly dead, i.e. with code
that emulates polymorphism with the preprocessor and sizeof.

GCC will perform semantic analysis after early inlining and dead code
elimination, so it will not warn on invalid code that's dead. Clang
strictly performs optimizations after semantic analysis, so it will warn
for dead code.

Neither Clang nor GCC like this very much with -m32:

long long ret;
asm ("movb $5, %0" : "=q" (ret));

However, GCC can tolerate this variant:

long long ret;
switch (sizeof(ret)) {
case 1:
        asm ("movb $5, %0" : "=q" (ret));
        break;
case 8:;
}

Clang, on the other hand, won't accept that because it validates the inline
asm for the '1' case before the optimisation phase where it realises that
it wouldn't have to emit it anyway.

If LLVM (Clang's "back end") fails such as during instruction selection or
register allocation, it cannot provide accurate diagnostics (warnings /
errors) that contain line information, as the AST has been discarded from
memory at that point.

While there have been early discussions about having C/C++ specific
language optimizations in Clang via the use of MLIR, which would enable
such earlier optimizations, such work is not scoped and likely a multi-year
endeavor.

It was discussed to change the asm output constraint for the one byte case
from "=q" to "=r". While it works for 64-bit, it fails on 32-bit. With '=r'
the compiler could fail to chose a register accessible as high/low which is
required for the byte operation. If that happens the assembly will fail.

Use a local temporary variable of type 'unsigned char' as output for the
byte copy inline asm and then assign it to the real output variable. This
prevents Clang from failing the semantic analysis in the above case.

The resulting code for the actual one byte copy is not affected as the
temporary variable is optimized out.

[ tglx: Amended changelog ]

Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: David Woodhouse <dwmw2@infradead.org>
Reported-by: Dmitry Golovin <dima@golovin.in>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://bugs.llvm.org/show_bug.cgi?id=33587
Link: https://github.com/ClangBuiltLinux/linux/issues/3
Link: https://github.com/ClangBuiltLinux/linux/issues/194
Link: https://github.com/ClangBuiltLinux/linux/issues/781
Link: https://lore.kernel.org/lkml/20180209161833.4605-1-dwmw2@infradead.org/
Link: https://lore.kernel.org/lkml/CAK8P3a1EBaWdbAEzirFDSgHVJMtWjuNt2HGG8z+vpXeNHwETFQ@mail.gmail.com/
Link: https://lkml.kernel.org/r/20200720204925.3654302-12-ndesaulniers@google.com
2020-07-23 12:38:31 +02:00
Brian Gerst 4719ffecbb x86/percpu: Remove unused PER_CPU() macro
Also remove now unused __percpu_mov_op.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-11-ndesaulniers@google.com
2020-07-23 11:46:43 +02:00
Brian Gerst c94055fe93 x86/percpu: Clean up percpu_stable_op()
Use __pcpu_size_call_return() to simplify this_cpu_read_stable().
Also remove __bad_percpu_size() which is now unused.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-10-ndesaulniers@google.com
2020-07-23 11:46:42 +02:00
Brian Gerst ebcd580bed x86/percpu: Clean up percpu_cmpxchg_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-9-ndesaulniers@google.com
2020-07-23 11:46:42 +02:00
Brian Gerst 73ca542fba x86/percpu: Clean up percpu_xchg_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-8-ndesaulniers@google.com
2020-07-23 11:46:41 +02:00
Brian Gerst bbff583b84 x86/percpu: Clean up percpu_add_return_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-7-ndesaulniers@google.com
2020-07-23 11:46:41 +02:00
Brian Gerst e4d16defbb x86/percpu: Remove "e" constraint from XADD
The "e" constraint represents a constant, but the XADD instruction doesn't
accept immediate operands.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-6-ndesaulniers@google.com
2020-07-23 11:46:40 +02:00
Brian Gerst 33e5614a43 x86/percpu: Clean up percpu_add_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-5-ndesaulniers@google.com
2020-07-23 11:46:40 +02:00
Brian Gerst bb631e3002 x86/percpu: Clean up percpu_from_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-4-ndesaulniers@google.com
2020-07-23 11:46:39 +02:00
Brian Gerst c175acc147 x86/percpu: Clean up percpu_to_op()
The core percpu macros already have a switch on the data size, so the switch
in the x86 code is redundant and produces more dead code.

Also use appropriate types for the width of the instructions.  This avoids
errors when compiling with Clang.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-3-ndesaulniers@google.com
2020-07-23 11:46:39 +02:00
Brian Gerst 6865dc3ae9 x86/percpu: Introduce size abstraction macros
In preparation for cleaning up the percpu operations, define macros for
abstraction based on the width of the operation.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Link: https://lkml.kernel.org/r/20200720204925.3654302-2-ndesaulniers@google.com
2020-07-23 11:46:39 +02:00
Uros Bizjak ef19f826ec crypto: x86 - Put back integer parts of include/asm/inst.h
Resolves conflict with the tip tree.

Fixes: d7866e503b ("crypto: x86 - Remove include/asm/inst.h")
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Stephen Rothwell <sfr@canb.auug.org.au>,
CC: "Chang S. Bae" <chang.seok.bae@intel.com>,
CC: Peter Zijlstra <peterz@infradead.org>,
CC: Sasha Levin <sashal@kernel.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-23 17:34:20 +10:00
Arnd Bergmann 44623b2818 crypto: x86/crc32c - fix building with clang ias
The clang integrated assembler complains about movzxw:

arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxw'

It seems that movzwq is the mnemonic that it expects instead,
and this is what objdump prints when disassembling the file.

Fixes: 6a8ce1ef39 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-23 17:34:16 +10:00
Jon Derrick ec0160891e irqdomain/treewide: Free firmware node after domain removal
Commit 711419e504 ("irqdomain: Add the missing assignment of
domain->fwnode for named fwnode") unintentionally caused a dangling pointer
page fault issue on firmware nodes that were freed after IRQ domain
allocation. Commit e3beca48a4 fixed that dangling pointer issue by only
freeing the firmware node after an IRQ domain allocation failure. That fix
no longer frees the firmware node immediately, but leaves the firmware node
allocated after the domain is removed.

The firmware node must be kept around through irq_domain_remove, but should be
freed it afterwards.

Add the missing free operations after domain removal where where appropriate.

Fixes: e3beca48a4 ("irqdomain/treewide: Keep firmware node unconditionally allocated")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# drivers/pci
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1595363169-7157-1-git-send-email-jonathan.derrick@intel.com
2020-07-23 00:08:52 +02:00
Dmitry Safonov ef2ff0f5d6 x86/dumpstack: Show registers dump with trace's log level
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

After all preparations are done, provide log_lvl parameter for
show_regs_if_on_stack() and wire up to actual log level used as
an argument for show_trace_log_lvl().

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-4-dima@arista.com
2020-07-22 23:56:54 +02:00
Dmitry Safonov 44e215352c x86/dumpstack: Add log_lvl to __show_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to __show_regs().
Keep the used log level intact to separate visible change.

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-3-dima@arista.com
2020-07-22 23:56:53 +02:00
Dmitry Safonov fd07f802a7 x86/dumpstack: Add log_lvl to show_iret_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to show_iret_regs() as a preparation to add it
to __show_regs() and show_regs_if_on_stack().

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Petr Mladek <pmladek@suse.com>
Link: https://lkml.kernel.org/r/20200629144847.492794-2-dima@arista.com
2020-07-22 23:56:53 +02:00
Thomas Gleixner d181d2da01 x86/dumpstack: Dump user space code correctly again
H.J. reported that post 5.7 a segfault of a user space task does not longer
dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It
prints 'Code: Bad RIP value.' instead.

This was broken by a recent change which made probe_kernel_read() reject
non-kernel addresses.

Update show_opcodes() so it retrieves user space opcodes via
copy_from_user_nmi().

Fixes: 98a23609b1 ("maccess: always use strict semantics for probe_kernel_read")
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87h7tz306w.fsf@nanos.tec.linutronix.de
2020-07-22 23:47:48 +02:00
Josh Poimboeuf 039a7a30ec x86/stacktrace: Fix reliable check for empty user task stacks
If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack.  But arch_stack_walk_reliable()
incorrectly treats it as unreliable.

That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks.  Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.

Thanks to commit 71c9582528 ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached.  So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
2020-07-22 23:47:47 +02:00
Josh Poimboeuf 372a8eaa05 x86/unwind/orc: Fix ORC for newly forked tasks
The ORC unwinder fails to unwind newly forked tasks which haven't yet
run on the CPU.  It correctly reads the 'ret_from_fork' instruction
pointer from the stack, but it incorrectly interprets that value as a
call stack address rather than a "signal" one, so the address gets
incorrectly decremented in the call to orc_find(), resulting in bad ORC
data.

Fix it by forcing 'ret_from_fork' frames to be signal frames.

Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lkml.kernel.org/r/f91a8778dde8aae7f71884b5df2b16d552040441.1594994374.git.jpoimboe@redhat.com
2020-07-22 23:47:47 +02:00
Linus Torvalds d15be54603 media fixes for v5.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAl8VlckACgkQCF8+vY7k
 4RWg9RAAhmGk/mYYAIAy6o2G8CPgcNnLrUuhXD+SyYrM25QlkinMjHXVhWiPo0F8
 DKq21vXg0qlOe7kwn0vSEQ7tClAuJhK/ERnLnDFWe36lO6QF5ydllzEg9XS+xVq2
 4Ur+jdkllaS8j7sjQrEdQF17HW52Sr0WB8KegoeUXfeKcGaIBmB+kfNstK2fWMa+
 yuVITK/YNpr53zwwsHlMvmIndui0DJHKGGWuTknZsoQfWJtO62YW50FmkizjqXIt
 +4DCRiq9Juc/MPYotP2pArwBFiqSl/uP0MeWqn4YzPXvhhhoGPTAhskBWDmZD8jT
 XI9Q9Uj5x5kCtTLx4CI3mG2X2XrZ+GqFKZhVD0WKnZN2skdHRn/3qn6tHm7O/Rtr
 s27MVKae5zSQs1XrCXwlEz03T5wZ1y1FCX3dlXGkQqbOQLQGbZTsE8wybPElzkLR
 18e8xTA1Ss7tUYf7qcrEmMk242byXP8F43cStaWCwsbJgddBpV2jFFfxan3/j6q9
 Kl6Y/ssRn6OrIy1UNpdCClRweLi0U53sFQqM/ZmyUbBXMbT/y+rL6h8Wel8EvFlR
 PzisYBI/G0AWHahbdHhS9mzQFAZb9etlAboMvI31bo4Bjpbv9UVkX5dMLxabcbEr
 B70EZ7nigg5EMqeBkemz0kD4W68qvQRKHK0d+TYcYSnq0VMmErI=
 =/5rQ
 -----END PGP SIGNATURE-----

Merge tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into master

Pull media fixes from Mauro Carvalho Chehab:
 "A series of fixes for the upcoming atomisp driver. They solve issues
  when probing atomisp on devices with multiple cameras and get rid of
  warnings when built with W=1.

  The diffstat is a bit long, as this driver has several abstractions.
  The patches that solved the issues with W=1 had to get rid of some
  duplicated code (there used to have 2 versions of the same code, one
  for ISP2401 and another one for ISP2400).

  As this driver is not in 5.7, such changes won't cause regressions"

* tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (38 commits)
  Revert "media: atomisp: keep the ISP powered on when setting it"
  media: atomisp: fix mask and shift operation on ISPSSPM0
  media: atomisp: move system_local consts into a C file
  media: atomisp: get rid of version-specific system_local.h
  media: atomisp: move global stuff into a common header
  media: atomisp: remove non-used 32-bits consts at system_local
  media: atomisp: get rid of some unused static vars
  media: atomisp: Fix error code in ov5693_probe()
  media: atomisp: Replace trace_printk by pr_info
  media: atomisp: Fix __func__ style warnings
  media: atomisp: fix help message for ISP2401 selection
  media: atomisp: i2c: atomisp-ov2680.c: fixed a brace coding style issue.
  media: atomisp: make const arrays static, makes object smaller
  media: atomisp: Clean up non-existing folders from Makefile
  media: atomisp: Get rid of ACPI specifics in gmin_subdev_add()
  media: atomisp: Provide Gmin subdev as parameter to gmin_subdev_add()
  media: atomisp: Use temporary variable for device in gmin_subdev_add()
  media: atomisp: Refactor PMIC detection to a separate function
  media: atomisp: Deduplicate return ret in gmin_i2c_write()
  media: atomisp: Make pointer to PMIC client global
  ...
2020-07-22 11:56:00 -07:00
Hu Haowen 2ac5413e5e x86/perf: Fix a typo
The word "Zhoaxin" is incorrect and the right one is "Zhaoxin".

Signed-off-by: Hu Haowen <xianfengting221@163.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200719105007.57649-1-xianfengting221@163.com
2020-07-22 10:22:08 +02:00
Peter Zijlstra 015dc08918 Merge branch 'sched/urgent' 2020-07-22 10:22:02 +02:00
Joerg Roedel de2b41be8f x86, vmlinux.lds: Page-align end of ..page_aligned sections
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.

As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.

This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.

Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.

Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.

Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.

[ tglx: Amended changelog ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
2020-07-22 09:38:37 +02:00
Eric W. Biederman be619f7f06 exec: Implement kernel_execve
To allow the kernel not to play games with set_fs to call exec
implement kernel_execve.  The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.

The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.

The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.

The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances.  In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.

Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21 08:24:52 -05:00
Christoph Hellwig 55db9c0e85 net: remove compat_sys_{get,set}sockopt
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Kevin Buettner 5714ee50bb copy_xstate_to_kernel: Fix typo which caused GDB regression
This fixes a regression encountered while running the
gdb.base/corefile.exp test in GDB's test suite.

In my testing, the typo prevented the sw_reserved field of struct
fxregs_state from being output to the kernel XSAVES area.  Thus the
correct mask corresponding to XCR0 was not present in the core file for
GDB to interrogate, resulting in the following behavior:

   [kev@f32-1 gdb]$ ./gdb -q testsuite/outputs/gdb.base/corefile/corefile testsuite/outputs/gdb.base/corefile/corefile.core
   Reading symbols from testsuite/outputs/gdb.base/corefile/corefile...
   [New LWP 232880]

   warning: Unexpected size of section `.reg-xstate/232880' in core file.

With the typo fixed, the test works again as expected.

Signed-off-by: Kevin Buettner <kevinb@redhat.com>
Fixes: 9e46365459 ("copy_xstate_to_kernel(): don't leave parts of destination uninitialized")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-19 17:09:10 -07:00
Linus Torvalds efb9666e90 A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in the
    recent ioperm/iopl rework. This caused the TSS and XEN's I/O bitmap to
    get out of sync.
 
  - Use the proper vectors for HYPERV.
 
  - Make disabling of stack protector for the entry code work with GCC
    builds which enable stack protector by default. Removing the option is
    not sufficient, it needs an explicit -fno-stack-protector to shut it
    off.
 
  - Mark check_user_regs() noinstr as it is called from noinstr code. The
    missing annotation causes it to be placed in the text section which
    makes it instrumentable.
 
  - Add the missing interrupt disable in exc_alignment_check()
 
  - Fixup a XEN_PV build dependency in the 32bit entry code
 
  - A few fixes to make the Clang integrated assembler happy
 
  - Move EFI stub build to the right place for out of tree builds
 
  - Make prepare_exit_to_usermode() static. It's not longer called from ASM
    code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8UR+MTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoQCUD/4/9W5FFvdZvQPwmXsHPaVnW9hUsXxG
 0tjc34xqDcgEl1U3khu+6jj+oHx+JM+4wGP/V49Wqx6xkrJ33/a8uYErAgI7+Pmp
 s3T2gXMWkgJtYFlDQdAMbeuuM2cOFZJw4BxxvTMth5EixQvk1EkX6QyBjLaSGo8y
 78sWtZ6Oh5Ql9ua/9TOilewLsCsQSFIFn0o/hawwwPUMrwGvD29scha0XHom+AO7
 uwejfU8klq2HJJaLaaiUaiNBkFz0TNGJtY+3mQpw8BPjCuuBQhYygrS0X4uQzo01
 4XJzhDnOVbAYWqi0/T+mAEmuJ9NBZJwYiYrwBYCkZgELwJKLzhzO2GOgP9xEsFY4
 VUNgqHFhKrQp10k2k4L/A5tmr+0GntiCQhdZi+/gty6oO/t3ni57pRcAhA9qBNOb
 8ZqumBwgaaAIqcmdtoyXAIveWOHnzwKEg6wmIGFbyCwHjeLJKJG7KhpXIpEuX+j2
 DC7EfYvRB+jllAk1CBypBvzD0DHfMZ0myPxCcZiW2wHTVAlkpY7hiIyPHqocjE9L
 OjOQ7FS6E2/p24lYVcLUFWcESxGFvQjjxwXk7htjpGUIZsQOhz/LOW+CIPCsfbqE
 HoEsHmNltksYYV9FDfevXRp5sbxpx3wQSLOgqNqiOpy4cTCG8boalUqHQ0OsN8Oa
 EgU067yF77ymRg==
 =QAeH
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master

Pull x86 fixes from Thomas Gleixner:
 "A pile of fixes for x86:

   - Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
     the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
     bitmap to get out of sync.

   - Use the proper vectors for HYPERV.

   - Make disabling of stack protector for the entry code work with GCC
     builds which enable stack protector by default. Removing the option
     is not sufficient, it needs an explicit -fno-stack-protector to
     shut it off.

   - Mark check_user_regs() noinstr as it is called from noinstr code.
     The missing annotation causes it to be placed in the text section
     which makes it instrumentable.

   - Add the missing interrupt disable in exc_alignment_check()

   - Fixup a XEN_PV build dependency in the 32bit entry code

   - A few fixes to make the Clang integrated assembler happy

   - Move EFI stub build to the right place for out of tree builds

   - Make prepare_exit_to_usermode() static. It's not longer called from
     ASM code"

* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Don't add the EFI stub to targets
  x86/entry: Actually disable stack protector
  x86/ioperm: Fix io bitmap invalidation on Xen PV
  x86: math-emu: Fix up 'cmp' insn for clang ias
  x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
  x86/entry: Add compatibility with IAS
  x86/entry/common: Make prepare_exit_to_usermode() static
  x86/entry: Mark check_user_regs() noinstr
  x86/traps: Disable interrupts in exc_aligment_check()
  x86/entry/32: Fix XEN_PV build dependency
2020-07-19 12:16:09 -07:00
Linus Torvalds 9413cd7792 Two fixes for the interrupt subsystem:
- Make the handling of the firmware node consistent and do not free the
    node after the domain has been created successfully. The core code
    stores a pointer to it which can lead to a use after free or double
    free.
 
    This used to "work" because the pointer was not stored when the initial
    code was written, but at some point later it was required to store
    it. Of course nobody noticed that the existing users break that way.
 
  - Handle affinity setting on inactive interrupts correctly when
    hierarchical irq domains are enabled. When interrupts are inactive with
    the modern hierarchical irqdomain design, the interrupt chips are not
    necessarily in a state where affinity changes can be handled. The legacy
    irq chip design allowed this because interrupts are immediately fully
    initialized at allocation time. X86 has a hacky workaround for this, but
    other implementations do not. This cased malfunction on GIC-V3. Instead
    of playing whack a mole to find all affected drivers, change the core
    code to store the requested affinity setting and then establish it when
    the interrupt is allocated, which makes the X86 hack go away.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8UP+4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoSuZD/9tNPR4fIDt4mC9ciSvwSqGTV+q1y1D
 zhXSDro4cJNjzy/9D475IJqOlvchaF9Nfun55b60Q6vnA4VN8G+kABEaG8uwr8mV
 ijTB4f0qKfW/9kUDTJRScq3nNmC3miqg8ZFgFEn6Ecxj3NHmwidATIi5sF6f/XVG
 DdhL0Jys7ycNeGBf7yIKbT5/NOULMHYy9rK1NDAeBo9u3klvmrwrHgdNsiDDhEaU
 KlHtwuQLCdjFY3Lf67YpSah+Hx/gXPI1VHUxDDFRoFmC4RlB0VjyXGydjsisOrSQ
 Cl2gnkQ6VOlLaLbN38nmia9nyb6npzE5iK1h9EDcaRhBACG9O23Bdo+YZYxl6BOP
 mXuyIVKJYczJEp7j1fGHW/aNCoEqC8dGVyN7toxMVfGZmF12JzMSt4SYItPeSjFC
 bPNPRCscpiMOQdgwgO0woK1764V46g1BlmxXtJRdWB4iwWgXcryaz65xzSfNeZF4
 0+TvdYs2FYjxwwIyWj8xJ3Npe1lKhH+06DA6gziwJt1u4it8rl82UcqMFyf/ty1w
 o5LHyMBWYm7SJXSeaZZj+nv7moJKJnmRYKnpry21cUzsK/vQEPX0vqhwh4dSFN3O
 BaBocDsOk+9wkmUwi6haP+6+vpadAFQrsqVhURtwc6OVSWn2/vsf2ZH5P36xwFWD
 tlFanb8hX9y2NQ==
 =elM3
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master

Pull irq fixes from Thomas Gleixner:
 "Two fixes for the interrupt subsystem:

   - Make the handling of the firmware node consistent and do not free
     the node after the domain has been created successfully. The core
     code stores a pointer to it which can lead to a use after free or
     double free.

     This used to "work" because the pointer was not stored when the
     initial code was written, but at some point later it was required
     to store it. Of course nobody noticed that the existing users break
     that way.

   - Handle affinity setting on inactive interrupts correctly when
     hierarchical irq domains are enabled.

     When interrupts are inactive with the modern hierarchical irqdomain
     design, the interrupt chips are not necessarily in a state where
     affinity changes can be handled. The legacy irq chip design allowed
     this because interrupts are immediately fully initialized at
     allocation time. X86 has a hacky workaround for this, but other
     implementations do not.

     This cased malfunction on GIC-V3. Instead of playing whack a mole
     to find all affected drivers, change the core code to store the
     requested affinity setting and then establish it when the interrupt
     is allocated, which makes the X86 hack go away"

* tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/affinity: Handle affinity setting on inactive interrupts correctly
  irqdomain/treewide: Keep firmware node unconditionally allocated
2020-07-19 11:53:08 -07:00
Arvind Sankar da05b143a3 x86/boot: Don't add the EFI stub to targets
vmlinux-objs-y is added to targets, which currently means that the EFI
stub gets added to the targets as well. It shouldn't be added since it
is built elsewhere.

This confuses Makefile.build which interprets the EFI stub as a target
	$(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a
and will create drivers/firmware/efi/libstub/ underneath
arch/x86/boot/compressed, to hold this supposed target, if building
out-of-tree. [0]

Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y.

[0] See scripts/Makefile.build near the end:
    # Create directories for object files if they do not exist

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
2020-07-19 13:07:11 +02:00
Kees Cook 58ac3154b8 x86/entry: Actually disable stack protector
Some builds of GCC enable stack protector by default. Simply removing
the arguments is not sufficient to disable stack protector, as the stack
protector for those GCC builds must be explicitly disabled. Remove the
argument removals and add -fno-stack-protector. Additionally include
missed x32 argument updates, and adjust whitespace for readability.

Fixes: 20355e5f73 ("x86/entry: Exclude low level entry code from sanitizing")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
2020-07-19 13:07:10 +02:00
Christoph Hellwig 2f9237d4f6 dma-mapping: make support for dma ops optional
Avoid the overhead of the dma ops support for tiny builds that only
use the direct mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2020-07-19 09:29:23 +02:00
Andy Lutomirski cadfad8701 x86/ioperm: Fix io bitmap invalidation on Xen PV
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.

Add a new pvop for tss_invalidate_io_bitmap() to fix it.

This is XSA-329.

Fixes: 22fe5b0439 ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
2020-07-18 12:31:49 +02:00
Andy Shevchenko 5f55dd5499 media: atomisp: move CCK endpoint address to generic header
IOSF MBI header contains a lot of definitions, such as
end point addresses of IPs. Move CCK address from AtomISP driver
to generic header.

While here, drop unused one.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-07-18 07:17:16 +02:00
Thomas Gleixner baedb87d1b genirq/affinity: Handle affinity setting on inactive interrupts correctly
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.

X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.

Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:

    - Store it in the irq descriptors affinity mask
    - Update the effective affinity to reflect that so user space has
      a consistent view
    - Don't call into the irq chip driver

This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.

Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.

For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.

Remove the X86 workaround as it is not longer required.

Fixes: 02edee152d ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
2020-07-17 23:30:43 +02:00
steve.wahl@hpe.com 3bcf25a40b x86/efi: Remove unused EFI_UV1_MEMMAP code
With UV1 support removed, EFI_UV1_MEMMAP is no longer used.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212956.019149227@hpe.com
2020-07-17 16:47:48 +02:00
steve.wahl@hpe.com 6aa3baabe1 x86/platform/uv: Remove uv bios and efi code related to EFI_UV1_MEMMAP
With UV1 removed, EFI_UV1_MEMMAP is not longer used.  Remove the code used
by it and the related code in EFI.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212955.902592618@hpe.com
2020-07-17 16:47:48 +02:00
steve.wahl@hpe.com 66d67fecd8 x86/efi: Remove references to no-longer-used efi_have_uv1_memmap()
In removing UV1 support, efi_have_uv1_memmap is no longer used.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212955.786177105@hpe.com
2020-07-17 16:47:47 +02:00
steve.wahl@hpe.com cadde2379f x86/efi: Delete SGI UV1 detection.
As a part of UV1 platform removal, don't try to recognize the platform
through DMI to set the EFI_UV1_MEMMAP bit.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212955.667726896@hpe.com
2020-07-17 16:47:47 +02:00
steve.wahl@hpe.com 3dad716240 x86/platform/uv: Remove efi=old_map command line option
As a part of UV1 platform removal, delete the efi=old_map option,
which is not longer needed.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200713212955.552098718@hpe.com
2020-07-17 16:47:46 +02:00
steve.wahl@hpe.com 5d66253751 x86/platform/uv: Remove vestigial mention of UV1 platform from bios header
Remove UV1 reference as UV1 is not longer supported by HPE.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212955.435951508@hpe.com
2020-07-17 16:47:46 +02:00
steve.wahl@hpe.com f584c75307 x86/platform/uv: Remove support for UV1 platform from uv
UV1 is not longer supported by HPE

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212955.320087418@hpe.com
2020-07-17 16:47:46 +02:00
steve.wahl@hpe.com 9b9ee17241 x86/platform/uv: Remove support for uv1 platform from uv_hub
UV1 is not longer supported by HPE.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212955.203480177@hpe.com
2020-07-17 16:47:45 +02:00
steve.wahl@hpe.com 711621a098 x86/platform/uv: Remove support for UV1 platform from uv_bau
UV1 is not longer supported.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212955.083309377@hpe.com
2020-07-17 16:47:45 +02:00
steve.wahl@hpe.com 3736e82d3a x86/platform/uv: Remove support for UV1 platform from uv_mmrs
UV1 is not longer supported.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212954.964332370@hpe.com
2020-07-17 16:47:44 +02:00
steve.wahl@hpe.com a6b740f173 x86/platform/uv: Remove support for UV1 platform from x2apic_uv_x
UV1 is not longer supported.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212954.846026992@hpe.com
2020-07-17 16:47:44 +02:00
steve.wahl@hpe.com 95328de5fc x86/platform/uv: Remove support for UV1 platform from uv_tlb
UV1 is not longer supported.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212954.728022415@hpe.com
2020-07-17 16:47:44 +02:00
steve.wahl@hpe.com 8b3c9b1606 x86/platform/uv: Remove support for UV1 platform from uv_time
UV1 is not longer supported

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200713212954.610885520@hpe.com
2020-07-17 16:47:43 +02:00
Kees Cook 3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Kees Cook aecfd220b2 x86/mm/numa: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings (e.g.
"unused variable"). If the compiler thinks it is uninitialized, either
simply initialize the variable or make compiler changes. As a precursor
to removing[2] this[3] macro[4], refactor code to avoid its need.

The original reason for its use here was to work around the #ifdef
being the only place the variable was used. This is better expressed
using IS_ENABLED() and a new code block where the variable can be used
unconditionally.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Fixes: 1e01979c8f ("x86, numa: Implement pfn -> nid mapping granularity check")
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:32:25 -07:00
Arnd Bergmann 81e96851ea x86: math-emu: Fix up 'cmp' insn for clang ias
The clang integrated assembler requires the 'cmp' instruction to
have a length prefix here:

arch/x86/math-emu/wm_sqrt.S:212:2: error: ambiguous instructions require an explicit suffix (could be 'cmpb', 'cmpw', or 'cmpl')
 cmp $0xffffffff,-24(%ebp)
 ^

Make this a 32-bit comparison, which it was clearly meant to be.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20200527135352.1198078-1-arnd@arndb.de
2020-07-16 17:26:42 +02:00
Sedat Dilek 5769fe26f3 x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
When assembling with Clang via `make LLVM_IAS=1` and CONFIG_HYPERV enabled,
we observe the following error:

<instantiation>:9:6: error: expected absolute expression
 .if HYPERVISOR_REENLIGHTENMENT_VECTOR == 3
     ^
<instantiation>:1:1: note: while in macro instantiation
idtentry HYPERVISOR_REENLIGHTENMENT_VECTOR asm_sysvec_hyperv_reenlightenment sysvec_hyperv_reenlightenment has_error_code=0
^
./arch/x86/include/asm/idtentry.h:627:1: note: while in macro instantiation
idtentry_sysvec HYPERVISOR_REENLIGHTENMENT_VECTOR sysvec_hyperv_reenlightenment;
^
<instantiation>:9:6: error: expected absolute expression
 .if HYPERVISOR_STIMER0_VECTOR == 3
     ^
<instantiation>:1:1: note: while in macro instantiation
idtentry HYPERVISOR_STIMER0_VECTOR asm_sysvec_hyperv_stimer0 sysvec_hyperv_stimer0 has_error_code=0
^
./arch/x86/include/asm/idtentry.h:628:1: note: while in macro instantiation
idtentry_sysvec HYPERVISOR_STIMER0_VECTOR sysvec_hyperv_stimer0;

This is caused by typos in arch/x86/include/asm/idtentry.h:

HYPERVISOR_REENLIGHTENMENT_VECTOR -> HYPERV_REENLIGHTENMENT_VECTOR
HYPERVISOR_STIMER0_VECTOR         -> HYPERV_STIMER0_VECTOR

For more details see ClangBuiltLinux issue #1088.

Fixes: a16be368dd ("x86/entry: Convert various hypervisor vectors to IDTENTRY_SYSVEC")
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1088
Link: https://github.com/ClangBuiltLinux/linux/issues/1043
Link: https://lore.kernel.org/patchwork/patch/1272115/
Link: https://lkml.kernel.org/r/20200714194740.4548-1-sedat.dilek@gmail.com
2020-07-16 17:25:10 +02:00
Jian Cai 6ee93f8df0 x86/entry: Add compatibility with IAS
Clang's integrated assembler does not allow symbols with non-absolute
values to be reassigned. Modify the interrupt entry loop macro to be
compatible with IAS by using a label and an offset.

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Jian Cai <caij2003@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> #
Link: https://github.com/ClangBuiltLinux/linux/issues/1043
Link: https://lkml.kernel.org/r/20200714233024.1789985-1-caij2003@gmail.com
2020-07-16 17:25:09 +02:00
Uros Bizjak d7866e503b crypto: x86 - Remove include/asm/inst.h
Current minimum required version of binutils is 2.23,
which supports PSHUFB, PCLMULQDQ, PEXTRD, AESKEYGENASSIST,
AESIMC, AESENC, AESENCLAST, AESDEC, AESDECLAST and MOVQ
instruction mnemonics.

Substitute macros from include/asm/inst.h with a proper
instruction mnemonics in various assmbly files from
x86/crypto directory, and remove now unneeded file.

The patch was tested by calculating and comparing sha256sum
hashes of stripped object files before and after the patch,
to be sure that executable code didn't change.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-16 21:49:07 +10:00
Ard Biesheuvel e79a317151 crypto: x86/chacha-sse3 - use unaligned loads for state array
Due to the fact that the x86 port does not support allocating objects
on the stack with an alignment that exceeds 8 bytes, we have a rather
ugly hack in the x86 code for ChaCha to ensure that the state array is
aligned to 16 bytes, allowing the SSE3 implementation of the algorithm
to use aligned loads.

Given that the performance benefit of using of aligned loads appears to
be limited (~0.25% for 1k blocks using tcrypt on a Corei7-8650U), and
the fact that this hack has leaked into generic ChaCha code, let's just
remove it.

Cc: Martin Willi <martin@strongswan.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin Willi <martin@strongswan.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-16 21:49:04 +10:00
Thomas Gleixner 790ce3b400 x86/idtentry: Remove stale comment
Stack switching for interrupt handlers happens in C now for both 64 and
32bit. Remove the stale comment which claims the contrary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2020-07-16 12:44:26 +02:00
Thomas Gleixner e3beca48a4 irqdomain/treewide: Keep firmware node unconditionally allocated
Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
creating the irqdomain. The only purpose of these FW nodes is to convey
name information. When this was introduced the core code did not store the
pointer to the node in the irqdomain. A recent change stored the firmware
node pointer in irqdomain for other reasons and missed to notice that the
usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
are broken by this. Storing a dangling pointer is dangerous itself, but in
case that the domain is destroyed later on this leads to a double free.

Remove the freeing of the firmware node after creating the irqdomain from
all affected call sites to cure this.

Fixes: 711419e504 ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/873661qakd.fsf@nanos.tec.linutronix.de
2020-07-14 17:44:42 +02:00
Paolo Bonzini e8af9e9f45 KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
The "if" that drops the present bit from the page structure fauls makes no sense.
It was added by yours truly in order to be bug-compatible with pre-existing code
and in order to make the tests pass; however, the tests are wrong.  The behavior
after this patch matches bare metal.

Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:54 -04:00
Mohammed Gamal 3edd68399d KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
This patch adds a new capability KVM_CAP_SMALLER_MAXPHYADDR which
allows userspace to query if the underlying architecture would
support GUEST_MAXPHYADDR < HOST_MAXPHYADDR and hence act accordingly
(e.g. qemu can decide if it should warn for -cpu ..,phys-bits=X)

The complications in this patch are due to unexpected (but documented)
behaviour we see with NPF vmexit handling in AMD processor.  If
SVM is modified to add guest physical address checks in the NPF
and guest #PF paths, we see the followning error multiple times in
the 'access' test in kvm-unit-tests:

            test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001
            Dump mapping: address: 0x123400000000
            ------L4: 24c3027
            ------L3: 24c4027
            ------L2: 24c5021
            ------L1: 1002000021

This is because the PTE's accessed bit is set by the CPU hardware before
the NPF vmexit. This is handled completely by hardware and cannot be fixed
in software.

Therefore, availability of the new capability depends on a boolean variable
allow_smaller_maxphyaddr which is set individually by VMX and SVM init
routines. On VMX it's always set to true, on SVM it's only set to true
when NPT is not enabled.

CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Babu Moger <babu.moger@amd.com>
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Message-Id: <20200710154811.418214-10-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:53 -04:00
Paolo Bonzini 8c4182bd27 KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
Ignore non-present page faults, since those cannot have reserved
bits set.

When running access.flat with "-cpu Haswell,phys-bits=36", the
number of trapped page faults goes down from 8872644 to 3978948.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-9-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:52 -04:00
Mohammed Gamal 1dbf5d68af KVM: VMX: Add guest physical address check in EPT violation and misconfig
Check guest physical address against its maximum, which depends on the
guest MAXPHYADDR. If the guest's physical address exceeds the
maximum (i.e. has reserved bits set), inject a guest page fault with
PFERR_RSVD_MASK set.

This has to be done both in the EPT violation and page fault paths, as
there are complications in both cases with respect to the computation
of the correct error code.

For EPT violations, unfortunately the only possibility is to emulate,
because the access type in the exit qualification might refer to an
access to a paging structure, rather than to the access performed by
the program.

Trapping page faults instead is needed in order to correct the error code,
but the access type can be obtained from the original error code and
passed to gva_to_gpa.  The corrections required in the error code are
subtle. For example, imagine that a PTE for a supervisor page has a reserved
bit set.  On a supervisor-mode access, the EPT violation path would trigger.
However, on a user-mode access, the processor will not notice the reserved
bit and not include PFERR_RSVD_MASK in the error code.

Co-developed-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-8-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:52 -04:00
Paolo Bonzini a0c134347b KVM: VMX: introduce vmx_need_pf_intercept
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-7-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:52 -04:00
Paolo Bonzini 32de2b5ee3 KVM: x86: update exception bitmap on CPUID changes
Allow vendor code to observe changes to MAXPHYADDR and start/stop
intercepting page faults.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 17:01:29 -04:00
Paolo Bonzini 6986982fef KVM: x86: rename update_bp_intercept to update_exception_bitmap
We would like to introduce a callback to update the #PF intercept
when CPUID changes.  Just reuse update_bp_intercept since VMX is
already using update_exception_bitmap instead of a bespoke function.

While at it, remove an unnecessary assignment in the SVM version,
which is already done in the caller (kvm_arch_vcpu_ioctl_set_guest_debug)
and has nothing to do with the exception bitmap.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 13:20:18 -04:00
Mohammed Gamal ec7771ab47 KVM: x86: mmu: Add guest physical address check in translate_gpa()
Intel processors of various generations have supported 36, 39, 46 or 52
bits for physical addresses.  Until IceLake introduced MAXPHYADDR==52,
running on a machine with higher MAXPHYADDR than the guest more or less
worked, because software that relied on reserved address bits (like KVM)
generally used bit 51 as a marker and therefore the page faults where
generated anyway.

Unfortunately this is not true anymore if the host MAXPHYADDR is 52,
and this can cause problems when migrating from a MAXPHYADDR<52
machine to one with MAXPHYADDR==52.  Typically, the latter are machines
that support 5-level page tables, so they can be identified easily from
the LA57 CPUID bit.

When that happens, the guest might have a physical address with reserved
bits set, but the host won't see that and trap it.  Hence, we need
to check page faults' physical addresses against the guest's maximum
physical memory and if it's exceeded, we need to add the PFERR_RSVD_MASK
bits to the page fault error code.

This patch does this for the MMU's page walks.  The next patches will
ensure that the correct exception and error code is produced whenever
no host-reserved bits are set in page table entries.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-4-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 13:09:59 -04:00
Mohammed Gamal cd313569f5 KVM: x86: mmu: Move translate_gpa() to mmu.c
Also no point of it being inline since it's always called through
function pointers. So remove that.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-3-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 13:09:40 -04:00
Mohammed Gamal 897861479c KVM: x86: Add helper functions for illegal GPA checking and page fault injection
This patch adds two helper functions that will be used to support virtualizing
MAXPHYADDR in both kvm-intel.ko and kvm.ko.

kvm_fixup_and_inject_pf_error() injects a page fault for a user-specified GVA,
while kvm_mmu_is_illegal_gpa() checks whether a GPA exceeds vCPU address limits.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200710154811.418214-2-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 13:07:28 -04:00
Vitaly Kuznetsov fe9304d318 KVM: x86: drop superfluous mmu_check_root() from fast_pgd_switch()
The mmu_check_root() check in fast_pgd_switch() seems to be
superfluous: when GPA is outside of the visible range
cached_root_available() will fail for non-direct roots
(as we can't have a matching one on the list) and we don't
seem to care for direct ones.

Also, raising #TF immediately when a non-existent GFN is written to CR3
doesn't seem to mach architectural behavior. Drop the check.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-10-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 13:03:12 -04:00
Vitaly Kuznetsov d82aaef9c8 KVM: nSVM: use nested_svm_load_cr3() on guest->host switch
Make nSVM code resemble nVMX where nested_vmx_load_cr3() is used on
both guest->host and host->guest transitions. Also, we can now
eliminate unconditional kvm_mmu_reset_context() and speed things up.

Note, nVMX has two different paths: load_vmcs12_host_state() and
nested_vmx_restore_host_state() and the later is used to restore from
'partial' switch to L2, it always uses kvm_mmu_reset_context().
nSVM doesn't have this yet. Also, nested_svm_vmexit()'s return value
is almost always ignored nowadays.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-9-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:59:39 -04:00
Vitaly Kuznetsov a506fdd223 KVM: nSVM: implement nested_svm_load_cr3() and use it for host->guest switch
Undesired triple fault gets injected to L1 guest on SVM when L2 is
launched with certain CR3 values. #TF is raised by mmu_check_root()
check in fast_pgd_switch() and the root cause is that when
kvm_set_cr3() is called from nested_prepare_vmcb_save() with NPT
enabled CR3 points to a nGPA so we can't check it with
kvm_is_visible_gfn().

Using generic kvm_set_cr3() when switching to nested guest is not
a great idea as we'll have to distinguish between 'real' CR3s and
'nested' CR3s to e.g. not call kvm_mmu_new_pgd() with nGPA. Following
nVMX implement nested-specific nested_svm_load_cr3() doing the job.

To support the change, nested_svm_load_cr3() needs to be re-ordered
with nested_svm_init_mmu_context().

Note: the current implementation is sub-optimal as we always do TLB
flush/MMU sync but this is still an improvement as we at least stop doing
kvm_mmu_reset_context().

Fixes: 7c390d350f ("kvm: x86: Add fast CR3 switch code path")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-8-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:57:37 -04:00
Vitaly Kuznetsov bf7dea4253 KVM: nSVM: move kvm_set_cr3() after nested_svm_uninit_mmu_context()
kvm_mmu_new_pgd() refers to arch.mmu and at this point it still references
arch.guest_mmu while arch.root_mmu is expected.

Note, the change is effectively a nop: when !npt_enabled,
nested_svm_uninit_mmu_context() does nothing (as we don't do
nested_svm_init_mmu_context()) and with npt_enabled we don't
do kvm_set_cr3().  However, it will matter when we move the
call to kvm_mmu_new_pgd into nested_svm_load_cr3().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-7-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:56:30 -04:00
Vitaly Kuznetsov 62156f6cd1 KVM: nSVM: introduce nested_svm_load_cr3()/nested_npt_enabled()
As a preparatory change for implementing nSVM-specific PGD switch
(following nVMX' nested_vmx_load_cr3()), introduce nested_svm_load_cr3()
instead of relying on kvm_set_cr3().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:55:36 -04:00
Vitaly Kuznetsov 59cd9bc5b0 KVM: nSVM: prepare to handle errors from enter_svm_guest_mode()
Some operations in enter_svm_guest_mode() may fail, e.g. currently
we suppress kvm_set_cr3() return value. Prepare the code to proparate
errors.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:55:13 -04:00
Vitaly Kuznetsov ebdb3dba7b KVM: nSVM: reset nested_run_pending upon nested_svm_vmrun_msrpm() failure
WARN_ON_ONCE(svm->nested.nested_run_pending) in nested_svm_vmexit()
will fire if nested_run_pending remains '1' but it doesn't really
need to, we are already failing and not going to run nested guest.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:54:25 -04:00
Paolo Bonzini 8c008659aa KVM: MMU: stop dereferencing vcpu->arch.mmu to get the context for MMU init
kvm_init_shadow_mmu() was actually the only function that could be called
with different vcpu->arch.mmu values.  Now that kvm_init_shadow_npt_mmu()
is separated from kvm_init_shadow_mmu(), we always know the MMU context
we need to use and there is no need to dereference vcpu->arch.mmu pointer.

Based on a patch by Vitaly Kuznetsov <vkuznets@redhat.com>.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:53:58 -04:00
Vitaly Kuznetsov 0f04a2ac4f KVM: nSVM: split kvm_init_shadow_npt_mmu() from kvm_init_shadow_mmu()
As a preparatory change for moving kvm_mmu_new_pgd() from
nested_prepare_vmcb_save() to nested_svm_init_mmu_context() split
kvm_init_shadow_npt_mmu() from kvm_init_shadow_mmu(). This also makes
the code look more like nVMX (kvm_init_shadow_ept_mmu()).

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710141157.1640173-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:53:27 -04:00
Vitaly Kuznetsov d574c539c3 KVM: x86: move MSR_IA32_PERF_CAPABILITIES emulation to common x86 code
state_test/smm_test selftests are failing on AMD with:
"Unexpected result from KVM_GET_MSRS, r: 51 (failed MSR was 0x345)"

MSR_IA32_PERF_CAPABILITIES is an emulated MSR on Intel but it is not
known to AMD code, we can move the emulation to common x86 code. For
AMD, we basically just allow the host to read and write zero to the MSR.

Fixes: 27461da310 ("KVM: x86/pmu: Support full width counting")
Suggested-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200710152559.1645827-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 12:52:56 -04:00
Linus Torvalds cb24c61b53 Two simple but important bugfixes.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8IQ20UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNzxQf+KziWiVgLYnRmGVJ4xztRGv8Cjt+o
 g1YRyJgpST4UEdHyO+T1JvO8HVzTxtDwZlRNHB9UtIaAsAuybSdpUaHeK9lvcZvi
 vd59ItmyHKFOtItfG6Qpj6MJKN9tbN1y2F9Vc+TXNLL9BLHwPTbvF3l5ffdtBJ6F
 zurQBec7hoCarXZJS2GfzBQ+16WxZmm7RLDpYtEqAayp+UHNw+Z1eMrMV6TwdxAA
 3LgDn3l+A+BNuIIUKFF9Y5Ef3T4zBWGbdsV7FR9mH1fq2DiT1Vz4IT674L+6rEom
 /KvyyjVPIfQF33sMZmVpLCpoe2IEmOtc7cu3zffqNL6gNw3YHZwMK6cgsg==
 =abJa
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull vkm fixes from Paolo Bonzini:
 "Two simple but important bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MIPS: Fix build errors for 32bit kernel
  KVM: nVMX: fixes for preemption timer migration
2020-07-10 08:34:12 -07:00
Paolo Bonzini 83d31e5271 KVM: nVMX: fixes for preemption timer migration
Commit 850448f35a ("KVM: nVMX: Fix VMX preemption timer migration",
2020-06-01) accidentally broke nVMX live migration from older version
by changing the userspace ABI.  Restore it and, while at it, ensure
that vmx->nested.has_preemption_timer_deadline is always initialized
according to the KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE flag.

Cc: Makarand Sonare <makarandsonare@google.com>
Fixes: 850448f35a ("KVM: nVMX: Fix VMX preemption timer migration")
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 06:15:36 -04:00
Peter Zijlstra f9ad4a5f3f lockdep: Remove lockdep_hardirq{s_enabled,_context}() argument
Now that the macros use per-cpu data, we no longer need the argument.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.571835311@infradead.org
2020-07-10 12:00:02 +02:00
Peter Zijlstra ba1f2b2eaa x86/entry: Fix NMI vs IRQ state tracking
While the nmi_enter() users did
trace_hardirqs_{off_prepare,on_finish}() there was no matching
lockdep_hardirqs_*() calls to complete the picture.

Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state
tracking across the NMIs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.216740948@infradead.org
2020-07-10 12:00:01 +02:00
Thomas Gleixner bc916e67c0 Merge branch 'x86/urgent' into x86/entry to pick up upstream fixes. 2020-07-10 11:45:22 +02:00
Sean Christopherson 6926f95acc KVM: Move x86's MMU memory cache helpers to common KVM code
Move x86's memory cache helpers to common KVM code so that they can be
reused by arm64 and MIPS in future patches.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-16-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:42 -04:00
Sean Christopherson 2aa9c199cf KVM: Move x86's version of struct kvm_mmu_memory_cache to common code
Move x86's 'struct kvm_mmu_memory_cache' to common code in anticipation
of moving the entire x86 implementation code to common KVM and reusing
it for arm64 and MIPS.  Add a new architecture specific asm/kvm_types.h
to control the existence and parameters of the struct.  The new header
is needed to avoid a chicken-and-egg problem with asm/kvm_host.h as all
architectures define instances of the struct in their vCPU structs.

Add an asm-generic version of kvm_types.h to avoid having empty files on
PPC and s390 in the long term, and for arm64 and mips in the short term.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-15-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:42 -04:00
Sean Christopherson 94ce87ef81 KVM: x86/mmu: Prepend "kvm_" to memory cache helpers that will be global
Rename the memory helpers that will soon be moved to common code and be
made globaly available via linux/kvm_host.h.  "mmu" alone is not a
sufficient namespace for globally available KVM symbols.

Opportunistically add "nr_" in mmu_memory_cache_free_objects() to make
it clear the function returns the number of free objects, as opposed to
freeing existing objects.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-14-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:41 -04:00
Sean Christopherson 378f5cd64a KVM: x86/mmu: Skip filling the gfn cache for guaranteed direct MMU topups
Don't bother filling the gfn array cache when the caller is a fully
direct MMU, i.e. won't need a gfn array for shadow pages.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-13-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:41 -04:00
Sean Christopherson 9688088378 KVM: x86/mmu: Zero allocate shadow pages (outside of mmu_lock)
Set __GFP_ZERO for the shadow page memory cache and drop the explicit
clear_page() from kvm_mmu_get_page().  This moves the cost of zeroing a
page to the allocation time of the physical page, i.e. when topping up
the memory caches, and thus avoids having to zero out an entire page
while holding mmu_lock.

Cc: Peter Feiner <pfeiner@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Junaid Shahid <junaids@google.com>
Cc: Jim Mattson <jmattson@google.com>
Suggested-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-12-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:40 -04:00
Sean Christopherson 5f6078f9f1 KVM: x86/mmu: Make __GFP_ZERO a property of the memory cache
Add a gfp_zero flag to 'struct kvm_mmu_memory_cache' and use it to
control __GFP_ZERO instead of hardcoding a call to kmem_cache_zalloc().
A future patch needs such a flag for the __get_free_page() path, as
gfn arrays do not need/want the allocator to zero the memory.  Convert
the kmem_cache paths to __GFP_ZERO now so as to avoid a weird and
inconsistent API in the future.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-11-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:40 -04:00
Sean Christopherson 171a90d70f KVM: x86/mmu: Separate the memory caches for shadow pages and gfn arrays
Use separate caches for allocating shadow pages versus gfn arrays.  This
sets the stage for specifying __GFP_ZERO when allocating shadow pages
without incurring extra cost for gfn arrays.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-10-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:40 -04:00
Sean Christopherson 531281ad98 KVM: x86/mmu: Clean up the gorilla math in mmu_topup_memory_caches()
Clean up the minimums in mmu_topup_memory_caches() to document the
driving mechanisms behind the minimums.  Now that encountering an empty
cache is unlikely to trigger BUG_ON(), it is less dangerous to be more
precise when defining the minimums.

For rmaps, the logic is 1 parent PTE per level, plus a single rmap, and
prefetched rmaps.  The extra objects in the current '8 + PREFETCH'
minimum came about due to an abundance of paranoia in commit
c41ef344de ("KVM: MMU: increase per-vcpu rmap cache alloc size"),
i.e. it could have increased the minimum to 2 rmaps.  Furthermore, the
unexpected extra rmap case was killed off entirely by commits
f759e2b4c7 ("KVM: MMU: avoid pte_list_desc running out in
kvm_mmu_pte_write") and f5a1e9f895 ("KVM: MMU: remove call to
kvm_mmu_pte_write from walk_addr").

For the so called page cache, replace '8' with 2*PT64_ROOT_MAX_LEVEL.
The 2x multiplier is needed because the cache is used for both shadow
pages and gfn arrays for indirect MMUs.

And finally, for page headers, replace '4' with PT64_ROOT_MAX_LEVEL.

Note, KVM now supports 5-level paging, i.e. the old minimums that used a
baseline derived from 4-level paging were technically wrong.  But, KVM
always allocates roots in a separate flow, e.g. it's impossible in the
current implementation to actually need 5 new shadow pages in a single
flow.  Use PT64_ROOT_MAX_LEVEL unmodified instead of subtracting 1, as
the direct usage is likely more intuitive to uninformed readers, and the
inflated minimum is unlikely to affect functionality in practice.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-9-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:39 -04:00
Sean Christopherson f3747a5a9e KVM: x86/mmu: Topup memory caches after walking GVA->GPA
Topup memory caches after walking the GVA->GPA translation during a
shadow page fault, there is no need to ensure the caches are full when
walking the GVA.  As of commit f5a1e9f895 ("KVM: MMU: remove call
to kvm_mmu_pte_write from walk_addr"), the FNAME(walk_addr) flow no
longer add rmaps via kvm_mmu_pte_write().

This avoids allocating memory in the case that the GVA is unmapped in
the guest, and also provides a paper trail of why/when the memory caches
need to be filled.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-8-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:39 -04:00
Sean Christopherson 832914452a KVM: x86/mmu: Move fast_page_fault() call above mmu_topup_memory_caches()
Avoid refilling the memory caches and potentially slow reclaim/swap when
handling a fast page fault, which does not need to allocate any new
objects.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-7-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:39 -04:00
Sean Christopherson 53a3f48771 KVM: x86/mmu: Try to avoid crashing KVM if a MMU memory cache is empty
Attempt to allocate a new object instead of crashing KVM (and likely the
kernel) if a memory cache is unexpectedly empty.  Use GFP_ATOMIC for the
allocation as the caches are used while holding mmu_lock.  The immediate
BUG_ON() makes the code unnecessarily explosive and led to confusing
minimums being used in the past, e.g. allocating 4 objects where 1 would
suffice.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-6-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:38 -04:00
Sean Christopherson 284aa86868 KVM: x86/mmu: Remove superfluous gotos from mmu_topup_memory_caches()
Return errors directly from mmu_topup_memory_caches() instead of
branching to a label that does the same.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:38 -04:00
Sean Christopherson 356ec69adf KVM: x86/mmu: Use consistent "mc" name for kvm_mmu_memory_cache locals
Use "mc" for local variables to shorten line lengths and provide
consistent names, which will be especially helpful when some of the
helpers are moved to common KVM code in future patches.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:37 -04:00
Sean Christopherson 45177cccd9 KVM: x86/mmu: Consolidate "page" variant of memory cache helpers
Drop the "page" variants of the topup/free memory cache helpers, using
the existence of an associated kmem_cache to select the correct alloc
or free routine.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:37 -04:00
Sean Christopherson 5962bfb748 KVM: x86/mmu: Track the associated kmem_cache in the MMU caches
Track the kmem_cache used for non-page KVM MMU memory caches instead of
passing in the associated kmem_cache when filling the cache.  This will
allow consolidating code and other cleanups.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:37 -04:00
Thomas Gleixner 2245d39886 x86/kvm/vmx: Use native read/write_cr2()
read/write_cr2() go throuh the paravirt XXL indirection, but nested VMX in
a XEN_PV guest is not supported.

Use the native variants.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Message-Id: <20200708195322.344731916@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:42 -04:00
Thomas Gleixner c3f08ed150 x86/kvm/svm: Use uninstrumented wrmsrl() to restore GS
On guest exit MSR_GS_BASE contains whatever the guest wrote to it and the
first action after returning from the ASM code is to set it to the host
kernel value. This uses wrmsrl() which is interesting at least.

wrmsrl() is either using native_write_msr() or the paravirt variant. The
XEN_PV code is uninteresting as nested SVM in a XEN_PV guest does not work.

But native_write_msr() can be placed out of line by the compiler especially
when paravirtualization is enabled in the kernel configuration. The
function is marked notrace, but still can be probed if
CONFIG_KPROBE_EVENTS_ON_NOTRACE is enabled.

That would be a fatal problem as kprobe events use per-CPU variables which
are GS based and would be accessed with the guest GS. Depending on the GS
value this would either explode in colorful ways or lead to completely
undebugable data corruption.

Aside of that native_write_msr() contains a tracepoint which objtool
complains about as it is invoked from the noinstr section.

As this cannot run inside a XEN_PV guest there is no point in using
wrmsrl(). Use native_wrmsrl() instead which is just a plain native WRMSR
without tracing or anything else attached.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Message-Id: <20200708195322.244847377@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:41 -04:00
Thomas Gleixner 135961e0a7 x86/kvm/svm: Move guest enter/exit into .noinstr.text
Move the functions which are inside the RCU off region into the
non-instrumentable text section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Message-Id: <20200708195322.144607767@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:41 -04:00
Thomas Gleixner 3ebccdf373 x86/kvm/vmx: Move guest enter/exit into .noinstr.text
Move the functions which are inside the RCU off region into the
non-instrumentable text section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Message-Id: <20200708195322.037311579@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:40 -04:00
Thomas Gleixner 9fc975e9ef x86/kvm/svm: Add hardirq tracing on guest enter/exit
Entering guest mode is more or less the same as returning to user
space. From an instrumentation point of view both leave kernel mode and the
transition to guest or user mode reenables interrupts on the host. In user
mode an interrupt is served directly and in guest mode it causes a VM exit
which then handles or reinjects the interrupt.

The transition from guest mode or user mode to kernel mode disables
interrupts, which needs to be recorded in instrumentation to set the
correct state again.

This is important for e.g. latency analysis because otherwise the execution
time in guest or user mode would be wrongly accounted as interrupt disabled
and could trigger false positives.

Add hardirq tracing to guest enter/exit functions in the same way as it
is done in the user mode enter/exit code, respecting the RCU requirements.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Message-Id: <20200708195321.934715094@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:39 -04:00
Thomas Gleixner 0642391e21 x86/kvm/vmx: Add hardirq tracing to guest enter/exit
Entering guest mode is more or less the same as returning to user
space. From an instrumentation point of view both leave kernel mode and the
transition to guest or user mode reenables interrupts on the host. In user
mode an interrupt is served directly and in guest mode it causes a VM exit
which then handles or reinjects the interrupt.

The transition from guest mode or user mode to kernel mode disables
interrupts, which needs to be recorded in instrumentation to set the
correct state again.

This is important for e.g. latency analysis because otherwise the execution
time in guest or user mode would be wrongly accounted as interrupt disabled
and could trigger false positives.

Add hardirq tracing to guest enter/exit functions in the same way as it
is done in the user mode enter/exit code, respecting the RCU requirements.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Message-Id: <20200708195321.822002354@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:39 -04:00
Thomas Gleixner 87fa7f3e98 x86/kvm: Move context tracking where it belongs
Context tracking for KVM happens way too early in the vcpu_run()
code. Anything after guest_enter_irqoff() and before guest_exit_irqoff()
cannot use RCU and should also be not instrumented.

The current way of doing this covers way too much code. Move it closer to
the actual vmenter/exit code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200708195321.724574345@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:38 -04:00
Maxim Levitsky 841c2be09f kvm: x86: replace kvm_spec_ctrl_test_value with runtime test on the host
To avoid complex and in some cases incorrect logic in
kvm_spec_ctrl_test_value, just try the guest's given value on the host
processor instead, and if it doesn't #GP, allow the guest to set it.

One such case is when host CPU supports STIBP mitigation
but doesn't support IBRS (as is the case with some Zen2 AMD cpus),
and in this case we were giving guest #GP when it tried to use STIBP

The reason why can can do the host test is that IA32_SPEC_CTRL msr is
passed to the guest, after the guest sets it to a non zero value
for the first time (due to performance reasons),
and as as result of this, it is pointless to emulate #GP condition on
this first access, in a different way than what the host CPU does.

This is based on a patch from Sean Christopherson, who suggested this idea.

Fixes: 6441fa6178 ("KVM: x86: avoid incorrect writes to host MSR_IA32_SPEC_CTRL")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200708115731.180097-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:38 -04:00
Like Xu 632a4cf57f KVM/x86: pmu: Fix #GP condition check for RDPMC emulation
In guest protected mode, if the current privilege level
is not 0 and the PCE flag in the CR4 register is cleared,
we will inject a #GP for RDPMC usage.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200708074409.39028-1-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:37 -04:00
Vitaly Kuznetsov 995decb6c4 KVM: x86: take as_id into account when checking PGD
OVMF booted guest running on shadow pages crashes on TRIPLE FAULT after
enabling paging from SMM. The crash is triggered from mmu_check_root() and
is caused by kvm_is_visible_gfn() searching through memslots with as_id = 0
while vCPU may be in a different context (address space).

Introduce kvm_vcpu_is_visible_gfn() and use it from mmu_check_root().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200708140023.1476020-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:37 -04:00
Xiaoyao Li 5668821aef KVM: x86: Move kvm_x86_ops.vcpu_after_set_cpuid() into kvm_vcpu_after_set_cpuid()
kvm_x86_ops.vcpu_after_set_cpuid() is used to update vmx/svm specific
vcpu settings based on updated CPUID settings. So it's supposed to be
called after CPUIDs are updated, i.e., kvm_update_cpuid_runtime().

Currently, kvm_update_cpuid_runtime() only updates CPUID bits of OSXSAVE,
APIC, OSPKE, MWAIT, KVM_FEATURE_PV_UNHALT and CPUID(0xD,0).ebx and
CPUID(0xD, 1).ebx. None of them is consumed by vmx/svm's
update_vcpu_after_set_cpuid(). So there is no dependency between them.

Move kvm_x86_ops.vcpu_after_set_cpuid() into kvm_vcpu_after_set_cpuid() is
obviously more reasonable.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200709043426.92712-6-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:25 -04:00
Xiaoyao Li 7c1b761be0 KVM: x86: Rename cpuid_update() callback to vcpu_after_set_cpuid()
The name of callback cpuid_update() is misleading that it's not about
updating CPUID settings of vcpu but updating the configurations of vcpu
based on the CPUIDs. So rename it to vcpu_after_set_cpuid().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200709043426.92712-5-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:18 -04:00
Xiaoyao Li 346ce3591d KVM: x86: Rename kvm_update_cpuid() to kvm_vcpu_after_set_cpuid()
Now there is no updating CPUID bits behavior in kvm_update_cpuid(),
rename it to kvm_vcpu_after_set_cpuid().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200709043426.92712-4-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 07:08:08 -04:00
Xiaoyao Li aedbaf4f6a KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()
Beside called in kvm_vcpu_ioctl_set_cpuid*(), kvm_update_cpuid() is also
called 5 places else in x86.c and 1 place else in lapic.c. All those 6
places only need the part of updating guest CPUIDs (OSXSAVE, OSPKE, APIC,
KVM_FEATURE_PV_UNHALT, ...) based on the runtime vcpu state, so extract
them as a separate kvm_update_cpuid_runtime().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200709043426.92712-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 06:53:49 -04:00
Xiaoyao Li a76733a987 KVM: x86: Introduce kvm_check_cpuid()
Use kvm_check_cpuid() to validate if userspace provides legal cpuid
settings and call it before KVM take any action to update CPUID or
update vcpu states based on given CPUID settings.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200709043426.92712-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 06:49:14 -04:00
Xiaoyao Li 36f37648ca KVM: X86: Move kvm_apic_set_version() to kvm_update_cpuid()
There is no dependencies between kvm_apic_set_version() and
kvm_update_cpuid() because kvm_apic_set_version() queries X2APIC CPUID bit,
which is not touched/changed by kvm_update_cpuid().

Obviously, kvm_apic_set_version() belongs to the category of updating
vcpu model.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200708065054.19713-9-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 06:48:09 -04:00
Thomas Gleixner bd87e6f661 x86/entry/common: Make prepare_exit_to_usermode() static
No users outside this file anymore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200708192934.301116609@linutronix.de
2020-07-09 11:18:30 +02:00
Thomas Gleixner 006e1ced51 x86/entry: Mark check_user_regs() noinstr
It's called from the non-instrumentable section.

Fixes: c9c26150e6 ("x86/entry: Assert that syscalls are on the right stack")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200708192934.191497962@linutronix.de
2020-07-09 11:18:29 +02:00
Thomas Gleixner bce9b042ec x86/traps: Disable interrupts in exc_aligment_check()
exc_alignment_check() fails to disable interrupts before returning to the
entry code.

Fixes: ca4c6a9858 ("x86/traps: Make interrupt enable/disable symmetric in C code")
Reported-by: syzbot+0889df9502bc0f112b31@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200708192934.076519438@linutronix.de
2020-07-09 11:18:29 +02:00
Sedat Dilek 3347c8a079 crypto: aesni - Fix build with LLVM_IAS=1
When building with LLVM_IAS=1 means using Clang's Integrated Assembly (IAS)
from LLVM/Clang >= v10.0.1-rc1+ instead of GNU/as from GNU/binutils
I see the following breakage in Debian/testing AMD64:

<instantiation>:15:74: error: too many positional arguments
 PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
                                                                         ^
 arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro instantiation
 GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
 ^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
 GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
 ^
arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation
 GCM_ENC_DEC dec
 ^
<instantiation>:15:74: error: too many positional arguments
 PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
                                                                         ^
arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation
 GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
 ^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
 GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
 ^
arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation
 GCM_ENC_DEC enc

Craig Topper suggested me in ClangBuiltLinux issue #1050:

> I think the "too many positional arguments" is because the parser isn't able
> to handle the trailing commas.
>
> The "unknown use of instruction mnemonic" is because the macro was named
> GHASH_4_ENCRYPT_4_PARALLEL_DEC but its being instantiated with
> GHASH_4_ENCRYPT_4_PARALLEL_dec I guess gas ignores case on the
> macro instantiation, but llvm doesn't.

First, I removed the trailing comma in the PRECOMPUTE line.

Second, I substituted:
1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec
2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc

With these changes I was able to build with LLVM_IAS=1 and boot on bare metal.

I confirmed that this works with Linux-kernel v5.7.5 final.

NOTE: This patch is on top of Linux v5.7 final.

Thanks to Craig and especially Nick for double-checking and his comments.

Suggested-by: Craig Topper <craig.topper@intel.com>
Suggested-by: Craig Topper <craig.topper@gmail.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: "ClangBuiltLinux" <clang-built-linux@googlegroups.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1050
Link: https://bugs.llvm.org/show_bug.cgi?id=24494
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 18:25:22 +10:00
Xiaoyao Li 565b782073 KVM: lapic: Use guest_cpuid_has() in kvm_apic_set_version()
Only code cleanup and no functional change.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200708065054.19713-8-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:22:01 -04:00
Xiaoyao Li 0d3b2ba16b KVM: X86: Go on updating other CPUID leaves when leaf 1 is absent
As handling of bits out of leaf 1 added over time, kvm_update_cpuid()
should not return directly if leaf 1 is absent, but should go on
updateing other CPUID leaves.

Keep the update of apic->lapic_timer.timer_mode_mask in a separate
wrapper, to minimize churn for code since it will be moved out of this
function in a future patch.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200708065054.19713-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:22:00 -04:00
Xiaoyao Li 1896409282 KVM: X86: Reset vcpu->arch.cpuid_nent to 0 if SET_CPUID* fails
Current implementation keeps userspace input of CPUID configuration and
cpuid->nent even if kvm_update_cpuid() fails. Reset vcpu->arch.cpuid_nent
to 0 for the case of failure as a simple fix.

Besides, update the doc to explicitly state that if IOCTL SET_CPUID*
fail KVM gives no gurantee that previous valid CPUID configuration is
kept.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200708065054.19713-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:22:00 -04:00
Like Xu 2e8cd7a3b8 kvm: x86: limit the maximum number of vPMU fixed counters to 3
Some new Intel platforms (such as TGL) already have the
fourth fixed counter TOPDOWN.SLOTS, but it has not been
fully enabled on KVM and the host.

Therefore, we limit edx.split.num_counters_fixed to 3,
so that it does not break the kvm-unit-tests PMU test
case and bad-handled userspace.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200624015928.118614-1-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:59 -04:00
Krish Sadhukhan 761e416934 KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests
According to section "Canonicalization and Consistency Checks" in APM vol. 2
the following guest state is illegal:

    "Any MBZ bit of CR3 is set."
    "Any MBZ bit of CR4 is set."

Suggeted-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <1594168797-29444-3-git-send-email-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:59 -04:00
Paolo Bonzini 53efe527ca KVM: x86: Make CR4.VMXE reserved for the guest
CR4.VMXE is reserved unless the VMX CPUID bit is set.  On Intel,
it is also tested by vmx_set_cr4, but AMD relies on kvm_valid_cr4,
so fix it.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:58 -04:00
Krish Sadhukhan b899c13277 KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()
Instead of creating the mask for guest CR4 reserved bits in kvm_valid_cr4(),
do it in kvm_update_cpuid() so that it can be reused instead of creating it
each time kvm_valid_cr4() is called.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <1594168797-29444-2-git-send-email-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:58 -04:00
Jim Mattson d42e3fae6f kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes
According to the SDM, when PAE paging would be in use following a
MOV-to-CR0 that modifies any of CR0.CD, CR0.NW, or CR0.PG, then the
PDPTEs are loaded from the address in CR3. Previously, kvm only loaded
the PDPTEs when PAE paging would be in use following a MOV-to-CR0 that
modified CR0.PG.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200707223630.336700-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:58 -04:00