remarkable-linux/arch/mips
Chen Jie 615eb603f4 MIPS: csum_partial: Improve instruction parallelism.
Computing the sum introduces true data dependencies. This patch removes
some of them and thereby increases instruction-level parallelism.

This patch brings up to 50% csum performance gain on Loongson 3a.

One example of how this patch works is in CSUM_BIGCHUNK1:
// ** original **    vs    ** patch applied **
    ADDC(sum, t0)           ADDC(t0, t1)
    ADDC(sum, t1)           ADDC(t2, t3)
    ADDC(sum, t2)           ADDC(sum, t0)
    ADDC(sum, t3)           ADDC(sum, t2)

In the original implementation, each ADDC(sum, ...) depends on the sum
value updated by the previous ADDC, which it reads as a source operand.

With this patch applied, the first two ADDC operations are independent
of each other, so they can execute simultaneously where the hardware
allows.
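
The reordering can be sketched in C. This is only an illustration, not the
kernel code: the real implementation is MIPS assembly, and addc(),
csum_serial() and csum_paired() are hypothetical names introduced here.
addc() is assumed to behave like ADDC, that is, an add whose carry-out is
folded back into the result.

```c
#include <stdint.h>

/* Hypothetical C stand-in for ADDC: add, then fold the carry back in. */
static inline uint32_t addc(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);     /* (sum < v) is 1 exactly when the add wrapped */
}

/* Original ordering: each step needs the sum produced by the previous one. */
static uint32_t csum_serial(uint32_t sum, const uint32_t w[4])
{
    sum = addc(sum, w[0]);
    sum = addc(sum, w[1]);
    sum = addc(sum, w[2]);
    sum = addc(sum, w[3]);
    return sum;
}

/* Patched ordering: the two pair sums do not depend on each other or on sum. */
static uint32_t csum_paired(uint32_t sum, const uint32_t w[4])
{
    uint32_t t0 = addc(w[0], w[1]);   /* ADDC(t0, t1) */
    uint32_t t2 = addc(w[2], w[3]);   /* ADDC(t2, t3), independent of the line above */
    sum = addc(sum, t0);              /* ADDC(sum, t0) */
    sum = addc(sum, t2);              /* ADDC(sum, t2) */
    return sum;
}
```

Both functions return the same value because addition with end-around carry
is associative and commutative; only the length of the dependency chain
differs.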

Another example is in the "copy and sum calculating chunk":
// ** original **    vs    ** patch applied **
    STORE(t0, UNIT(0) ...   STORE(t0, UNIT(0) ...
    ADDC(sum, t0)           ADDC(t0, t1)
    STORE(t1, UNIT(1) ...   STORE(t1, UNIT(1) ...
    ADDC(sum, t1)           ADDC(sum, t0)
    STORE(t2, UNIT(2) ...   STORE(t2, UNIT(2) ...
    ADDC(sum, t2)           ADDC(t2, t3)
    STORE(t3, UNIT(3) ...   STORE(t3, UNIT(3) ...
    ADDC(sum, t3)           ADDC(sum, t2)

With this patch applied, an ADDC and the one after the next (for example
ADDC(t0, t1) and ADDC(t2, t3)) are independent.
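
A minimal C sketch of the patched copy-and-sum ordering follows. It is an
illustration under assumptions, not the kernel code: copy_and_sum4() and
addc() are hypothetical names, and plain array stores stand in for the
STORE macro. Note that each word is stored before the add that overwrites
its register.

```c
#include <stdint.h>

/* Hypothetical C stand-in for ADDC: add, then fold the carry back in. */
static inline uint32_t addc(uint32_t sum, uint32_t v)
{
    sum += v;
    return sum + (sum < v);
}

/* Copy four words from src to dst and accumulate them into sum. */
static uint32_t copy_and_sum4(uint32_t sum, uint32_t *dst, const uint32_t *src)
{
    uint32_t t0 = src[0], t1 = src[1], t2 = src[2], t3 = src[3];

    dst[0] = t0;            /* STORE(t0, UNIT(0) ...), before t0 is overwritten */
    t0 = addc(t0, t1);      /* ADDC(t0, t1) */
    dst[1] = t1;            /* STORE(t1, UNIT(1) ...) */
    sum = addc(sum, t0);    /* ADDC(sum, t0) */
    dst[2] = t2;            /* STORE(t2, UNIT(2) ...), before t2 is overwritten */
    t2 = addc(t2, t3);      /* ADDC(t2, t3), does not depend on ADDC(t0, t1) */
    dst[3] = t3;            /* STORE(t3, UNIT(3) ...) */
    sum = addc(sum, t2);    /* ADDC(sum, t2) */
    return sum;
}
```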

Signed-off-by: chenj <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9608/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-04-01 17:22:11 +02:00
alchemy Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2015-02-21 19:41:38 -08:00
ar7 mips: Convert pr_warning to pr_warn 2014-11-24 07:44:51 +01:00
ath25 MIPS: ath25: add Wireless device support 2014-11-24 07:45:29 +01:00
ath79 MIPS: ath79: Increase max memory limit to 256MByte 2015-04-01 17:21:57 +02:00
bcm47xx MIPS: BCM47XX: Fix coding style to match kernel standards 2015-04-01 17:22:10 +02:00
bcm63xx MIPS: Remove useless parentheses 2014-11-24 07:44:49 +01:00
bmips MIPS: BMIPS: restrict DTB selection to BMIPS_GENERIC 2015-04-01 17:22:04 +02:00
boot MIPS: OCTEON: add GPIO LED support for DSR-1000N 2015-04-01 17:22:10 +02:00
cavium-octeon MIPS: OCTEON: add GPIO LED support for DSR-1000N 2015-04-01 17:22:10 +02:00
cobalt MIPS: Cobalt: Move to 8250/16550 serial early printk driver 2013-10-29 21:24:38 +01:00
configs MIPS: XPA: Add new configuration file. 2015-04-01 17:21:45 +02:00
dec Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2014-06-09 18:10:34 -07:00
emma MIPS: Remove panic_timeout settings 2013-11-26 12:12:27 +01:00
fw MIPS: ARC: Use __noreturn / unreachable in ARC termination functions. 2015-01-13 16:04:27 +01:00
include MIPS: SEAD3: New header file sead3-addr.h with hardware addresses. 2015-04-01 17:22:07 +02:00
jazz
jz4740 MIPS: jz4740: Implement read_sched_clock 2015-04-01 17:21:31 +02:00
kernel MIPS: Provide fallback reboot/poweroff/halt implementations 2015-04-01 17:21:58 +02:00
kvm KVM: MIPS: Enable after disabling interrupt 2015-03-02 19:18:12 -03:00
lantiq MIPS: mark prom_free_prom_memory() everywhere with __init 2015-04-01 17:21:58 +02:00
lasat MIPS: Lasat: Add missing CONFIG_PROC_FS dependency to PICVUE_PROC 2014-10-21 17:35:44 +02:00
lib MIPS: csum_partial: Improve instruction parallelism. 2015-04-01 17:22:11 +02:00
loongson MIPS: Loongson-3: remove deprecated IRQF_DISABLED 2015-04-01 17:22:00 +02:00
loongson1 MIPS: Loongson1B: Add a clockevent/clocksource using PWM Timer 2014-11-24 07:45:09 +01:00
math-emu MIPS: Add FPU emulator counter for emulated delay slots. 2015-04-01 17:21:57 +02:00
mm MIPS: DMA: Implement platform hook to perform post-DMA cache flushes. 2015-04-01 17:22:01 +02:00
mti-malta MIPS: Malta: malta-time: Ensure GIC counter is running 2015-03-31 12:04:13 +02:00
mti-sead3 MIPS: SEAD3: Nuke remaining I2C bits. 2015-04-01 17:22:08 +02:00
net module: remove mod arg from module_free, rename module_memfree(). 2015-01-20 11:38:33 +10:30
netlogic MIPS: Netlogic: Add built-in dts for XLP5xx boards 2015-04-01 17:21:54 +02:00
oprofile MIPS: OProfile: Allow sharing IRQ with timer 2015-03-31 12:04:12 +02:00
paravirt mips: Update the email address of Geert Uytterhoeven 2014-06-02 16:34:41 +02:00
pci MIPS: pci: Drop owner assignment from platform_drivers 2015-04-01 17:21:55 +02:00
pistachio MIPS: Add support for the IMG Pistachio SoC 2015-03-31 12:04:12 +02:00
pmcs-msp71xx kconfig: use bool instead of boolean for type definition attributes 2015-01-07 13:08:04 +01:00
pnx833x MIPS: PNX833x: Remove checks for CONFIG_I2C_PNX0105 2014-05-23 15:12:39 +02:00
power MIPS: Hibernate: Restructure files and functions 2015-04-01 17:22:09 +02:00
ralink Driver core patches for 3.19-rc1 2014-12-14 16:10:09 -08:00
rb532 MIPS: Replace use of phys_t with phys_addr_t. 2014-11-24 22:47:31 +01:00
sgi-ip22 MIPS: ip22-gio: Remove legacy suspend/resume support 2015-02-20 13:30:55 +01:00
sgi-ip27 MIPS: sgi-ip27: Implement read_sched_clock 2015-04-01 17:21:29 +02:00
sgi-ip32 MIPS: IP32: Use __noreturn instead of open coded attributes in declarations. 2015-01-13 16:04:28 +01:00
sibyte MIPS: Replace use of phys_t with phys_addr_t. 2014-11-24 22:47:31 +01:00
sni MIPS: Cleanup CP0 PRId and CP1 FPIR register access masks 2013-09-18 20:25:19 +02:00
txx9 Driver core patches for 3.19-rc1 2014-12-14 16:10:09 -08:00
vr41xx MIPS: Idle: Consolidate all declarations in <asm/idle.h>. 2013-05-22 01:34:27 +02:00
Kbuild MIPS: net: Add BPF JIT 2014-05-30 16:10:20 +02:00
Kbuild.platforms MIPS: bcm3384: Rename "bcm3384" target to "bmips" 2015-04-01 17:21:35 +02:00
Kconfig MIPS: Netlogic: Added HugeTLB as default 2015-04-01 17:21:52 +02:00
Kconfig.debug MIPS: kernel: elf: Improve the overall ABI and FPU mode checks 2015-02-17 15:37:39 +00:00
Makefile MIPS: Add dtbs_install target 2015-04-01 17:21:34 +02:00