1
0
Fork 0
alistair23-linux/arch/x86
Fenghua Yu ae28d1aae4 x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR
Currently, when moving a task to a resource group the PQR_ASSOC MSR is
updated with the new closid and rmid in an added task callback. If the
task is running, the work is run as soon as possible. If the task is not
running, the work is executed later in the kernel exit path when the
kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
is running is the right thing to do. Queueing work for a task that is
not running is unnecessary (the PQR_ASSOC MSR is already updated when
the task is scheduled in) and causing system resource waste with the way
in which it is implemented: Work to update the PQR_ASSOC register is
queued every time the user writes a task id to the "tasks" file, even if
the task already belongs to the resource group.

This could result in multiple pending work items associated with a
single task even if they are all identical and even though only a single
update with most recent values is needed. Specifically, even if a task
is moved between different resource groups while it is sleeping then it
is only the last move that is relevant but yet a work item is queued
during each move.

This unnecessary queueing of work items could result in significant
system resource waste, especially on tasks sleeping for a long time.
For example, as demonstrated by Shakeel Butt in [1] writing the same
task id to the "tasks" file can quickly consume significant memory. The
same problem (wasted system resources) occurs when moving a task between
different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue
with the way in which the queueing of work is done in that the task_struct
update is currently done after the work is queued, resulting in a race with
the register update possibly done before the data needed by the update is
available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way
right after the new closid and rmid are ready during the task movement,
only if the task is running. If a moved task is not running nothing
is done since the PQR_ASSOC MSR will be updated next time the task is
scheduled. This is the same way used to update the register when tasks
are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com

 [ bp: Massage commit message and drop the two update_task_closid_rmid()
   variants. ]

Fixes: e02737d5b8 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
2021-01-08 09:03:36 +01:00
..
boot EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
configs * A defconfig fix, from Daniel Díaz. 2020-09-20 15:06:43 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-12-14 12:18:19 -08:00
entry epoll: wire up syscall epoll_pwait2 2020-12-19 11:18:38 -08:00
events Perf updates: 2020-12-14 17:34:12 -08:00
hyperv hyperv-fixes for 5.10-rc3 2020-11-05 11:32:03 -08:00
ia32 x86/ia32_signal: Propagate __user annotation properly 2020-12-11 19:44:31 +01:00
include EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
kernel x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR 2021-01-08 09:03:36 +01:00
kvm ARM: 2020-12-20 10:44:05 -08:00
lib Scheduler updates: 2020-12-14 18:29:11 -08:00
math-emu treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mm x86/mm: Fix leak of pmd ptlock 2021-01-05 11:40:23 +01:00
net bpf: x64: Do not emit sub/add 0, %rsp when !stack_depth 2020-09-29 16:47:39 -07:00
oprofile x86/oprofile: Avoid TIF_IA32 when checking 64bit mode 2020-10-26 13:46:46 +01:00
pci ARM: SoC drivers for v5.11 2020-12-16 16:38:41 -08:00
platform Yet another large set of x86 interrupt management updates: 2020-12-14 18:59:53 -08:00
power Kbuild updates for v5.9 2020-08-09 14:10:26 -07:00
purgatory crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
ras treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
realmode x86/head/64: Don't call verify_cpu() on starting APs 2020-09-09 11:33:20 +02:00
tools x86/insn: Make inat-tables.c suitable for pre-decompression code 2020-09-07 19:45:24 +02:00
um arch/um: partially revert the conversion to __section() macro 2020-10-26 15:39:37 -07:00
video
xen EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
.gitignore
Kbuild
Kconfig fanotify: Fix sys_fanotify_mark() on native x86-32 2020-12-28 11:58:59 +01:00
Kconfig.assembler
Kconfig.cpu treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Kconfig.debug x86, libnvdimm/test: Remove COPY_MC_TEST 2020-10-26 18:08:35 +01:00
Makefile - Fix the vmlinux size check on 64-bit along with adding useful clarifications on the topic 2020-12-14 13:54:50 -08:00
Makefile.um
Makefile_32.cpu