remarkable-linux/arch/s390/include/asm
Christian Borntraeger fb3d1c085c s390: let the compiler do page clearing
The hardware folks told me that for page clearing "when you exactly
know what to do, hand written xc+pfd is usually faster than mvcl for
page clearing, as it saves millicode overhead and parameter parsing
and checking" as long as you don't need the cache bypassing.
It turns out that gcc already generates a proper xc+pfd loop.

A small test on z196 that does

buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
for (i = 0; i < bufsize; i += 256)
    buff[i] = 0x5;

gets 20% faster (touches every cache line of a page)

and

buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
for (i = 0; i < bufsize; i += 4096)
    buff[i] = 0x5;

is within the noise (touches one cache line of a page).
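
The two tests above differ only in the stride of the write loop. A minimal user-space sketch of such a measurement, assuming Linux and a monotonic clock, could look like the following. The function name time_first_touch, the writes out-parameter, and the timing code are hypothetical additions and not part of the original test; PROT_EXEC is dropped and -1 is passed as the file descriptor, which is the portable form for anonymous mappings.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Map bufsize bytes of anonymous memory and write one byte every
 * `stride` bytes, so each touched page is faulted in (and cleared by
 * the kernel) on first access. Returns the elapsed time in seconds,
 * or -1.0 if mmap fails. If `writes` is non-NULL it receives the
 * number of stores performed.
 */
double time_first_touch(size_t bufsize, size_t stride, size_t *writes)
{
    struct timespec start, end;
    volatile unsigned char *buff;
    size_t n = 0;

    assert(stride > 0);          /* a zero stride would never terminate */

    buff = mmap(NULL, bufsize, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buff == MAP_FAILED)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (size_t i = 0; i < bufsize; i += stride) {
        buff[i] = 0x5;           /* same store as in the original test */
        n++;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    munmap((void *)buff, bufsize);
    if (writes)
        *writes = n;

    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_first_touch(bufsize, 256, NULL) corresponds to the first test and time_first_touch(bufsize, 4096, NULL) to the second. The absolute numbers depend on the machine and kernel, so only the relative change between an unpatched and a patched kernel is meaningful.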

As clear_page is usually called on the first access to memory,
we can assume that at least one cache line is used afterwards,
so this change should always be an improvement.
Another benchmark, a make -j 40 of my testsuite in tmpfs with
hot caches on a 32-CPU system:

 -- unpatched --       --  patched  --
real     0m1.017s     real     0m0.994s   (~2% faster, but in noise)
user     0m5.339s     user     0m5.016s   (~6% faster)
sys      0m0.691s     sys      0m0.632s   (~8% faster)

Let's use the same define to memset as the asm-generic variant.
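
As a rough illustration of what "the same define to memset" means, here is a user-space sketch, assuming 4 KiB pages. The names SKETCH_PAGE_SIZE, sketch_clear_page and sketch_page_is_clear are hypothetical stand-ins chosen to avoid clashing with system headers; the kernel's own macro uses PAGE_SIZE and is not reproduced verbatim here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4 KiB pages. Hypothetical name, not the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Clear one page by handing the job to memset, so the compiler is free
 * to expand it into whatever clearing loop it considers best.
 */
#define sketch_clear_page(page) memset((page), 0, SKETCH_PAGE_SIZE)

/* Helper for checking the result: returns 1 if every byte is zero. */
int sketch_page_is_clear(const unsigned char *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}
```

Because the size is a compile-time constant, the compiler can inline the memset rather than emit a library call, which is the effect the commit message relies on.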

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-02-26 09:24:49 +01:00
airq.h s390/airq: add support for irq ranges 2014-03-04 10:41:04 +01:00
appldata.h
asm-offsets.h
atomic.h arch,s390: Convert smp_mb__*() 2014-04-18 14:20:42 +02:00
barrier.h arch: Add lightweight memory barriers dma_rmb() and dma_wmb() 2014-12-11 21:15:06 -05:00
bitops.h s390/bitops,atomic: add missing memory barriers 2014-04-01 09:23:35 +02:00
bug.h
bugs.h
cache.h
cacheflush.h mm/debug_pagealloc: fix build failure on ppc and some other archs 2015-02-05 13:35:30 -08:00
ccwdev.h s390/cio: fix multiple structure definitions 2014-05-20 08:58:53 +02:00
ccwgroup.h s390: fix new ccwgroup.h kernel-doc warning 2014-05-20 08:58:45 +02:00
checksum.h s390/checksum: remove memset() within csum_partial_copy_from_user() 2014-02-24 17:14:08 +01:00
chpid.h s390/cio: fix multiple structure definitions 2014-05-20 08:58:53 +02:00
cio.h treewide: Fix typo in Documentation/DocBook 2014-02-19 14:58:17 +01:00
clp.h
cmb.h
cmpxchg.h s390/cmpxchg: use compiler builtins 2014-11-03 13:29:47 +01:00
compat.h s390/compat: build error for large compat syscall args 2014-03-06 16:30:47 +01:00
cpcmd.h
cpu.h
cpu_mf.h s390: add SMT support 2015-01-22 12:16:01 +01:00
cputime.h s390/cputime: fix 31-bit compile 2014-12-08 14:03:43 +01:00
crw.h
css_chars.h s390/qdio: bridgeport support - CHSC part 2014-01-15 14:48:01 -08:00
ctl_reg.h s390/ctl_reg: add union type for control register 0 2014-04-22 13:24:36 +02:00
current.h
debug.h s390/debug: avoid function call for debug_sprintf_* 2014-12-08 09:42:29 +01:00
delay.h
device.h
diag.h
dis.h s390/disassembler: add vector instructions 2014-10-09 09:14:15 +02:00
div64.h
dma-mapping.h s390: Implement dma_{alloc,free}_attrs() 2014-08-26 07:39:12 +02:00
dma.h
eadm.h s390/scm_block: do not hide eadm subchannel dependency 2013-11-15 14:08:42 +01:00
ebcdic.h
elf.h s390: avoid z13 cache aliasing 2015-01-22 12:15:59 +01:00
emergency-restart.h
etr.h
exec.h
extmem.h
facility.h
fb.h
fcx.h s390/cio: fix error-prone defines 2013-10-24 17:17:04 +02:00
ftrace.h s390/ftrace: hotpatch support for function tracing 2015-01-29 09:19:25 +01:00
futex.h s390/uaccess: simplify control register updates 2014-05-20 08:58:46 +02:00
hardirq.h hardirq: Make hardirq bits generic 2013-11-13 20:21:46 +01:00
hugetlb.h s390/mm: cleanup page table definitions 2013-08-22 12:20:06 +02:00
hw_irq.h s390: convert interrupt handling to use generic hardirq 2013-08-22 12:20:04 +02:00
idals.h
idle.h s390/idle: convert open coded idle time seqcount 2014-12-08 09:42:32 +01:00
io.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2014-12-11 17:30:55 -08:00
ipl.h s390/kdump: add support for vector extension 2014-10-09 09:14:16 +02:00
irq.h s390/irq: use irq 0 2014-11-18 18:23:03 +01:00
irq_regs.h
irqflags.h s390/kernel: use stnsm 255 instead of stosm 0 2014-12-18 13:37:15 +01:00
isc.h
itcw.h
jump_label.h s390/jump label: use different nop instruction 2015-01-29 16:33:34 +01:00
Kbuild net, lib: kill arch_fast_hash library bits 2014-12-10 15:17:46 -05:00
kdebug.h
kexec.h
kmap_types.h
kprobes.h s390/ftrace,kprobes: allow to patch first instruction 2014-10-27 13:27:27 +01:00
kvm_host.h KVM: s390: add cpu model support 2015-02-09 12:44:13 +01:00
kvm_para.h
linkage.h
local.h
local64.h
lowcore.h s390/ftrace,kprobes: allow to patch first instruction 2014-10-27 13:27:27 +01:00
mathemu.h
mman.h
mmu.h KVM: s390: Adding skey bit to mmu context 2014-04-22 09:36:23 +02:00
mmu_context.h mm: Make arch_unmap()/bprm_mm_init() available to all architectures 2014-11-19 11:54:13 +01:00
module.h
mutex.h mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef 2013-09-28 12:46:21 +02:00
nmi.h s390: add support for vector extension 2014-10-09 09:14:13 +02:00
os_info.h
page.h s390: let the compiler do page clearing 2015-02-26 09:24:49 +01:00
pci.h s390/pci: improve irq number check for msix 2014-11-03 13:30:12 +01:00
pci_clp.h s390/pci: add some new arch specific pci attributes 2014-05-20 08:58:50 +02:00
pci_debug.h s390/pci: remove CONFIG_PCI_DEBUG dependancy 2013-10-24 17:17:16 +02:00
pci_dma.h
pci_insn.h s390/pci: cleanup function information block 2013-10-24 17:17:17 +02:00
pci_io.h s390: add pci_iomap_range 2015-01-21 16:28:49 +10:30
percpu.h s390: Replace __get_cpu_var uses 2014-08-26 13:45:52 -04:00
perf_event.h s390/cpum_sf: Add flag to process full SDBs only 2013-12-16 14:38:01 +01:00
pgalloc.h 3.19 changes for KVM: 2014-12-18 16:05:28 -08:00
pgtable.h Merge branch 'akpm' (patches from Andrew) 2015-02-11 18:23:28 -08:00
processor.h s390: reintroduce diag 44 calls for cpu_relax() 2015-01-29 09:19:16 +01:00
ptrace.h s390/uprobes: architecture backend for uprobes 2014-09-25 10:52:17 +02:00
qdio.h s390/qdio: add helpers to manage qdio buffers 2014-07-22 09:26:13 +02:00
reset.h s390: add SMT support 2015-01-22 12:16:01 +01:00
runtime_instr.h
rwsem.h
schid.h
sclp.h Fairly small update, but there are some interesting new features. 2015-02-13 09:55:09 -08:00
scsw.h
seccomp.h
sections.h
segment.h
serial.h s390: convert interrupt handling to use generic hardirq 2013-08-22 12:20:04 +02:00
setup.h s390/spinlock: add compare-and-delay to lock wait loops 2015-01-23 15:17:04 +01:00
sfp-machine.h
sfp-util.h
shmparam.h
signal.h
sigp.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2015-02-11 17:42:32 -08:00
smp.h s390: add SMT support 2015-01-22 12:16:01 +01:00
sparsemem.h
spinlock.h s390/cmpxchg: use compiler builtins 2014-11-03 13:29:47 +01:00
spinlock_types.h s390/rwlock: use directed yield for write-locked rwlocks 2014-09-25 10:52:05 +02:00
string.h lib/string.c: remove strnicmp() 2015-02-12 18:54:14 -08:00
switch_to.h s390/kdump: add support for vector extension 2014-10-09 09:14:16 +02:00
syscall.h s390/seccomp: fix error return for filtered system calls 2014-07-28 10:02:31 +02:00
sysinfo.h Fairly small update, but there are some interesting new features. 2015-02-13 09:55:09 -08:00
termios.h
thread_info.h all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
timex.h s390/timex: fix get_tod_clock_ext() inline assembly 2015-01-07 09:52:47 +01:00
tlb.h s390/mm: fix memory leak of ptlock in pmd_free_tlb 2014-12-08 09:42:40 +01:00
tlbflush.h s390/mm,tlb: optimize TLB flushing for zEC12 2014-04-03 14:31:00 +02:00
topology.h s390/topology: convert cpu_topology array to per cpu variable 2015-02-12 09:37:22 +01:00
types.h
uaccess.h s390/uaccess: provide inline variants of get_user/put_user 2014-05-20 08:58:50 +02:00
unaligned.h
unistd.h
uprobes.h s390/uprobes: architecture backend for uprobes 2014-09-25 10:52:17 +02:00
user.h
vdso.h s390/vdso: add vdso support for coarse clocks 2014-09-09 08:53:27 +02:00
vga.h
vtime.h vtime: Describe overriden functions in dedicated arch headers 2013-08-14 17:14:53 +02:00
vtimer.h s390/idle: consolidate idle functions and definitions 2014-10-09 09:14:03 +02:00
xor.h