1
0
Fork 0
Commit Graph

837527 Commits (360985573b556db415daf93e34ce103ec0ee1fe5)

Author SHA1 Message Date
Eric Biggers 360985573b f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
f2fs copied all the on-disk i_flags from ext4, and along with it the
assumption that the on-disk i_flags are the same as the bits used by
FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.  This is problematic because
reserving an on-disk inode flag in either filesystem's i_flags or in
these ioctls effectively reserves it in all the other places too.  In
fact, most of the "f2fs i_flags" are not used by f2fs at all.

Fix this by separating f2fs's i_flags from the ioctl bits and ext4's
i_flags.

In the process, un-reserve all "f2fs i_flags" that aren't actually
supported by f2fs.  This included various flags that were not settable
at all, as well as various flags that were settable by FS_IOC_SETFLAGS
but didn't actually do anything.

There's a slight chance we'll need to add some flag(s) back to
FS_IOC_SETFLAGS in order to avoid breaking users who expect f2fs to
accept some random flag(s).  But hopefully such users don't exist.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-21 10:41:57 -07:00
Kimberly Brown 176ef3c4de f2fs: replace ktype default_attrs with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs fields in f2fs_sb_ktype
and f2fs_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
to create f2fs_groups and f2fs_feat_groups.

Fixes: fef4129ec2 ("f2fs: fix to be aware discard/preflush/dio command in is_idle()")
Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-21 10:41:56 -07:00
Daniel Rosenberg 4d3aed7090 f2fs: Add option to limit required GC for checkpoint=disable
This extends the checkpoint option to allow checkpoint=disable:%u[%]
This allows you to specify what how much of the disk you are willing
to lose access to while mounting with checkpoint=disable. If the amount
lost would be higher, the mount will return -EAGAIN. This can be given
as a percent of total space, or in blocks.

Currently, we need to run garbage collection until the amount of holes
is smaller than the OVP space. With the new option, f2fs can mark
space as unusable up front instead of requiring garbage collection until
the number of holes is small enough.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:48 -07:00
Daniel Rosenberg a4c3ecaaad f2fs: Fix accounting for unusable blocks
Fixes possible underflows when dealing with unusable blocks.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Daniel Rosenberg 9a9aecaad9 f2fs: Fix root reserved on remount
On a remount, you can currently set root reserved if it was not
previously set. This can cause an underflow if reserved has been set to
a very high value, since then root reserved + current reserved could be
greater than user_block_count. inc_valid_block_count later subtracts out
these values from user_block_count, causing an underflow.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Daniel Rosenberg ae4ad7ea09 f2fs: Lower threshold for disable_cp_again
The existing threshold for allowable holes at checkpoint=disable time is
too high. The OVP space contains reserved segments, which are always in
the form of free segments. These must be subtracted from the OVP value.

The current threshold is meant to be the maximum value of holes of a
single type we can have and still guarantee that we can fill the disk
without failing to find space for a block of a given type.

If the disk is full, ignoring current reserved, which only helps us,
the amount of unused blocks is equal to the OVP area. Of that, there
are reserved segments, which must be free segments, and the rest of the
ovp area, which can come from either free segments or holes. The maximum
possible amount of holes is OVP-reserved.

Now, consider the disk when mounting with checkpoint=disable.
We must be able to fill all available free space with either data or
node blocks. When we start with checkpoint=disable, holes are locked to
their current type. Say we have H of one type of hole, and H+X of the
other. We can fill H of that space with arbitrary typed blocks via SSR.
For the remaining H+X blocks, we may not have any of a given block type
left at all. For instance, if we were to fill the disk entirely with
blocks of the type with fewer holes, the H+X blocks of the opposite type
would not be used. If H+X > OVP-reserved, there would be more holes than
could possibly exist, and we would have failed to find a suitable block
earlier on, leading to a crash in update_sit_entry.

If H+X <= OVP-reserved, then the holes end up effectively masked by the OVP
region in this case.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Chao Yu 36af5f407b f2fs: fix sparse warning
make C=2 CHECKFLAGS="-D__CHECK_ENDIAN__"

CHECK   dir.c
dir.c:842:50: warning: cast from restricted __le32
CHECK   node.c
node.c:2759:40: warning: restricted __le32 degrades to integer

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-30 09:48:45 -07:00
Sahitya Tummala 81621f9761 f2fs: fix f2fs_show_options to show nodiscard mount option
Fix f2fs_show_options to show nodiscard mount option.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-30 09:13:54 -07:00
Sahitya Tummala 9227d5227b f2fs: add error prints for debugging mount failure
Add error prints to get more details on the mount failure.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-30 09:13:54 -07:00
Chao Yu c854f4d681 f2fs: fix to do sanity check on segment bitmap of LFS curseg
As Jungyeon Reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=203233

- Reproduces
gcc poc_13.c
./run.sh f2fs

- Kernel messages
 F2FS-fs (sdb): Bitmap was wrongly set, blk:4608
 kernel BUG at fs/f2fs/segment.c:2133!
 RIP: 0010:update_sit_entry+0x35d/0x3e0
 Call Trace:
  f2fs_allocate_data_block+0x16c/0x5a0
  do_write_page+0x57/0x100
  f2fs_do_write_node_page+0x33/0xa0
  __write_node_page+0x270/0x4e0
  f2fs_sync_node_pages+0x5df/0x670
  f2fs_write_checkpoint+0x364/0x13a0
  f2fs_sync_fs+0xa3/0x130
  f2fs_do_sync_file+0x1a6/0x810
  do_fsync+0x33/0x60
  __x64_sys_fsync+0xb/0x10
  do_syscall_64+0x43/0x110
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

The testcase fails because that, in fuzzed image, current segment was
allocated with LFS type, its .next_blkoff should point to an unused
block address, but actually, its bitmap shows it's not. So during
allocation, f2fs crash when setting bitmap.

Introducing sanity_check_curseg() to check such inconsistence of
current in-used segment.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-30 09:13:49 -07:00
Jaegeuk Kim 4d11d13e27 f2fs: add missing sysfs entries in documentation
This patch cleans up documentation to cover missing sysfs entries.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-30 09:13:41 -07:00
Chao Yu 040d2bb318 f2fs: fix to avoid deadloop if data_flush is on
As Hagbard Celine reported:

[  615.697824] INFO: task kworker/u16:5:344 blocked for more than 120 seconds.
[  615.697825]       Not tainted 5.0.15-gentoo-f2fslog #4
[  615.697826] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  615.697827] kworker/u16:5   D    0   344      2 0x80000000
[  615.697831] Workqueue: writeback wb_workfn (flush-259:0)
[  615.697832] Call Trace:
[  615.697836]  ? __schedule+0x2c5/0x8b0
[  615.697839]  schedule+0x32/0x80
[  615.697841]  schedule_preempt_disabled+0x14/0x20
[  615.697842]  __mutex_lock.isra.8+0x2ba/0x4d0
[  615.697845]  ? log_store+0xf5/0x260
[  615.697848]  f2fs_write_data_pages+0x133/0x320
[  615.697851]  ? trace_hardirqs_on+0x2c/0xe0
[  615.697854]  do_writepages+0x41/0xd0
[  615.697857]  __filemap_fdatawrite_range+0x81/0xb0
[  615.697859]  f2fs_sync_dirty_inodes+0x1dd/0x200
[  615.697861]  f2fs_balance_fs_bg+0x2a7/0x2c0
[  615.697863]  ? up_read+0x5/0x20
[  615.697865]  ? f2fs_do_write_data_page+0x2cb/0x940
[  615.697867]  f2fs_balance_fs+0xe5/0x2c0
[  615.697869]  __write_data_page+0x1c8/0x6e0
[  615.697873]  f2fs_write_cache_pages+0x1e0/0x450
[  615.697878]  f2fs_write_data_pages+0x14b/0x320
[  615.697880]  ? trace_hardirqs_on+0x2c/0xe0
[  615.697883]  do_writepages+0x41/0xd0
[  615.697885]  __filemap_fdatawrite_range+0x81/0xb0
[  615.697887]  f2fs_sync_dirty_inodes+0x1dd/0x200
[  615.697889]  f2fs_balance_fs_bg+0x2a7/0x2c0
[  615.697891]  f2fs_write_node_pages+0x51/0x220
[  615.697894]  do_writepages+0x41/0xd0
[  615.697897]  __writeback_single_inode+0x3d/0x3d0
[  615.697899]  writeback_sb_inodes+0x1e8/0x410
[  615.697902]  __writeback_inodes_wb+0x5d/0xb0
[  615.697904]  wb_writeback+0x28f/0x340
[  615.697906]  ? cpumask_next+0x16/0x20
[  615.697908]  wb_workfn+0x33e/0x420
[  615.697911]  process_one_work+0x1a1/0x3d0
[  615.697913]  worker_thread+0x30/0x380
[  615.697915]  ? process_one_work+0x3d0/0x3d0
[  615.697916]  kthread+0x116/0x130
[  615.697918]  ? kthread_create_worker_on_cpu+0x70/0x70
[  615.697921]  ret_from_fork+0x3a/0x50

There is still deadloop in below condition:

d A
- do_writepages
 - f2fs_write_node_pages
  - f2fs_balance_fs_bg
   - f2fs_sync_dirty_inodes
    - f2fs_write_cache_pages
     - mutex_lock(&sbi->writepages)	-- lock once
     - __write_data_page
      - f2fs_balance_fs_bg
       - f2fs_sync_dirty_inodes
        - f2fs_write_data_pages
         - mutex_lock(&sbi->writepages)	-- lock again

Thread A			Thread B
- do_writepages
 - f2fs_write_node_pages
  - f2fs_balance_fs_bg
   - f2fs_sync_dirty_inodes
    - .cp_task = current
				- f2fs_sync_dirty_inodes
				 - .cp_task = current
				 - filemap_fdatawrite
				 - .cp_task = NULL
    - filemap_fdatawrite
     - f2fs_write_cache_pages
      - enter f2fs_balance_fs_bg since .cp_task is NULL
    - .cp_task = NULL

Change as below to avoid this:
- add condition to avoid holding .writepages mutex lock in path
of data flush
- introduce mutex lock sbi.flush_lock to exclude concurrent data
flush in background.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:19 -07:00
Park Ju Hyung f7dfd9f361 f2fs: always assume that the device is idle under gc_urgent
This allows more aggressive discards and balancing job to be done
under gc_urgent.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:19 -07:00
Chao Yu 8648de2c58 f2fs: add bio cache for IPU
SQLite in Wal mode may trigger sequential IPU write in db-wal file, after
commit d1b3e72d54 ("f2fs: submit bio of in-place-update pages"), we
lost the chance of merging page in inner managed bio cache, result in
submitting more small-sized IO.

So let's add temporary bio in writepages() to cache mergeable write IO as
much as possible.

Test case:
1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 65536" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "pwrite 0 65536" -c "fsync"

Before:
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65544, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65552, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65560, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65568, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65576, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65584, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65592, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65600, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65608, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65616, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65624, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65632, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65640, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65648, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65656, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65664, size = 4096
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), NODE, sector = 57352, size = 4096

After:
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), DATA, sector = 65544, size = 65536
f2fs_submit_write_bio: dev = (251,0)/(251,0), rw = WRITE(S), NODE, sector = 57368, size = 4096

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:19 -07:00
Jaegeuk Kim 49dd883c42 f2fs: allow ssr block allocation during checkpoint=disable period
This patch allows to use ssr during checkpoint is disabled.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:18 -07:00
Chao Yu 5dae2d3907 f2fs: fix to check layout on last valid checkpoint park
As Ju Hyung reported:

"
I was semi-forced today to use the new kernel and test f2fs.

My Ubuntu initramfs got a bit wonky and I had to boot into live CD and
fix some stuffs. The live CD was using 4.15 kernel, and just mounting
the f2fs partition there corrupted f2fs and my 4.19(with 5.1-rc1-4.19
f2fs-stable merged) refused to mount with "SIT is corrupted node"
message.

I used the latest f2fs-tools sent by Chao including "fsck.f2fs: fix to
repair cp_loads blocks at correct position"

It spit out 140M worth of output, but at least I didn't have to run it
twice. Everything returned "Ok" in the 2nd run.
The new log is at
http://arter97.com/f2fs/final

After fixing the image, I used my 4.19 kernel with 5.2-rc1-4.19
f2fs-stable merged and it mounted.

But, I got this:
[    1.047791] F2FS-fs (nvme0n1p3): layout of large_nat_bitmap is
deprecated, run fsck to repair, chksum_offset: 4092
[    1.081307] F2FS-fs (nvme0n1p3): Found nat_bits in checkpoint
[    1.161520] F2FS-fs (nvme0n1p3): recover fsync data on readonly fs
[    1.162418] F2FS-fs (nvme0n1p3): Mounted with checkpoint version = 761c7e00

But after doing a reboot, the message is gone:
[    1.098423] F2FS-fs (nvme0n1p3): Found nat_bits in checkpoint
[    1.177771] F2FS-fs (nvme0n1p3): recover fsync data on readonly fs
[    1.178365] F2FS-fs (nvme0n1p3): Mounted with checkpoint version = 761c7eda

I'm not exactly sure why the kernel detected that I'm still using the
old layout on the first boot. Maybe fsck didn't fix it properly, or
the check from the kernel is improper.
"

Although we have rebuild the old deprecated checkpoint with new layout
during repair, we only repair last checkpoint park, the other old one is
remained.

Once the image was mounted, we will 1) sanity check layout and 2) decide
which checkpoint park to use according to cp_ver. So that we will print
reported message unnecessarily at step 1), to avoid it, we simply move
layout check into f2fs_sanity_check_ckpt() after step 2).

Reported-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:18 -07:00
Jaegeuk Kim bc88ac96a9 f2fs: link f2fs quota ops for sysfile
This patch reverts:
commit fb40d618b0 ("f2fs: don't clear CP_QUOTA_NEED_FSCK_FLAG").

We were missing error handlers used in f2fs quota ops.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-05-23 07:03:11 -07:00
Linus Torvalds e0654264c4 - Fix-ups
- Remove unused BACKLIGHT_LCD_SUPPORT symbol; Kconfig
    - Remove unused BACKLIGHT_CLASS_DEVICE dependencies; Kconfig
    - Add DT support; lm3630a_bl
 
  - Bug Fixes
    - Fix error path issues; lm3630a_bl
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAlzamtsACgkQUa+KL4f8
 d2GIww//V+VGBKYrAzpZW2SHvqMyE/wGle/wJGoyhGrTJ5FQ68DJJK1mzkq/DQMG
 ayWTVIpdjZMCiFeuel+DFpa4qSwoYydqtCAtKeey5XLB/BDFRmx9ysJVfAcrmrQg
 NDvWhc+mEccGLMwndX1p+QGboSOjwN5hc1FSnXww6XA+pnTNvenQunDOnp6v/cUI
 YNJssdHdzjZfApnwG9dEIguuD22Jp6APJjfinkcsp2UR1bDymdpkSMn0d/89RR7I
 T0RjFF0Lexj4dd6IE6WHbCXeQKZq48meIH3aNF5i5nx8QibFg/Pd/3gcnQYL/l/o
 JUFy8tmR15DCWjPY411b+A8sIsxO5xt3L3WNtp6YZdwAMAl/6LEXHMoAWwvwXVty
 k3fxe3C/ansRe1KXABWlRGyrOn4qJ9D3c+3cauUdFqcYdzuCPox3nJUciOExk0y5
 QIjS6jDMTk4r2vzWxJMMWhslYDa460oiTDnDemt7s9MlpbAwmjDpJarXhnSjfYQN
 U5EhuLczGyMZS0VCUYJhQQF0BnzvMB8aIzzcuy0BMRPJXnwfmZ8/Wc3HND2KislP
 E1rKtzmYGs1wbH/IuMBH26Bpz7iutnaY0RqeBTQ2sK6G4yGYBqkrI2nLli0dFEEf
 zMITeSA8s/ll9WWvmUUnHUBzwh56XgX1uGVbkgYxC3co7yFq/yg=
 =QRgB
 -----END PGP SIGNATURE-----

Merge tag 'backlight-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "Fix-ups:
   - Remove unused BACKLIGHT_LCD_SUPPORT symbol
   - Remove unused BACKLIGHT_CLASS_DEVICE dependencies
   - Add DT support to lm3630a_bl

  Bug Fixes:
   - Fix error path issues in lm3630a_bl"

* tag 'backlight-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: lm3630a: Add firmware node support
  dt-bindings: backlight: Add lm3630a bindings
  backlight: lm3630a: Return 0 on success in update_status functions
  video: lcd: Remove useless BACKLIGHT_CLASS_DEVICE dependencies
  video: backlight: Remove useless BACKLIGHT_LCD_SUPPORT kernel symbol
2019-05-14 10:45:03 -07:00
Linus Torvalds ebcf5bb282 - Core Frameworks
- Document (kerneldoc) core mfd_add_devices() API
 
  - New Drivers
    - Add support for Altera SOCFPGA System Manager
    - Add support for Maxim MAX77650/77651 PMIC
    - Add support for Maxim MAX77663 PMIC
    - Add support for ST Multi-Function eXpander (STMFX)
 
  - New Device Support
    - Add support for LEDs to Intel Cherry Trail Whiskey Cove PMIC
    - Add support for RTC to SAMSUNG Electronics S2MPA01 PMIC
    - Add support for SAM9X60 to Atmel HLCDC (High-end LCD Controller)
    - Add support for USB X-Powers AXP 8xx PMICs
    - Add support for Integrated Sensor Hub (ISH) to ChromeOS EC
    - Add support for USB PD Logger to ChromeOS EC
    - Add support for AXP223 to X-Powers AXP series PMICs
    - Add support for Power Supply to X-Powers AXP 803 PMICs
    - Add support for Comet Lake to Intel Low Power Subsystem
    - Add support for Fingerprint MCU to ChromeOS EC
    - Add support for Touchpad MCU to ChromeOS EC
    - Move TI LM3532 support to LED
 
  - New Functionality
    - Add/extend DT support; max77650, max77620
    - Add support for power-off; max77620
    - Add support for clocking; syscon
    - Add support for host sleep event; cros_ec
 
  - Fix-ups
    - Trivial; Formatting, spelling, etc; Kconfig, sec-core, ab8500-debugfs
    - Remove unused functionality; rk808, da9063-*
    - SPDX conversion; da9063-*, atmel-*,
    - Adapt/add new register definitions; cs47l35-tables, cs47l90-tables, imx6q-iomuxc-gpr
    - Fix-up DT bindings; ti-lmu, cirrus,lochnagar
    - Simply obtaining driver data; ssbi, t7l66xb, tc6387xb, tc6393xb
 
  - Bug Fixes
    - Fix incorrect defined values; max77620, da9063
    - Fix device initialisation; twl6040
    - Reset device on init; intel-lpss
    - Fix build warnings when !OF; sun6i-prcm
    - Register OF match tables; tps65912-spi
    - Fix DMI matching; intel_quark_i2c_gpio
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAlzame0ACgkQUa+KL4f8
 d2GbBQ//bUoA+hcTo/ZUyQQGmE8axikZ6pacY+Y41pdzzLFoOM3IIz4NpdUF0fP2
 6r11zDiM2cL9CuMJl/AMiBv7fifowYykaBUEkkm8n2Cpj/bpLIm8eQy6jf14kqNR
 gj9sTy/feBcnZhqLLx9x9W9695nRTE4q3g+mDOj5sXRvZxqcPBaNgWkk5a8vtN9V
 yH2XkQSoK0EvvNWjl3pshp7HdKhX8k1xDZ2ghOi3Yk9JmFlg+wrWEKE4KQ7dDoUa
 SFXFReIwyleAw4Bc/demT1tSDiNgIPc9ZHtb67dUmDCQgpQqTK/h6WV1JeW1I0vh
 AM6n2hnogcbVcJdAHtwS5tR6nVahpUQ1V+XhYDyyHNmx6rqW5q2e3xRF75CT4wBZ
 NMIVaWNlih62Y196Exy+6CANHvJyxL6yRgvXkpfyaf9vYdXUrBRUujxn1PzrbkNJ
 kJwvZk5yHgg0n5SIV/D4CVy+RHP6uqe4oE4iXNWP5Um06OyVCieqMvoduyGQdLG/
 7Xrflc4EmeqTfWZrnW3ljh6sOBC+MQCfIKgRtvkPQ5EpcNU2VPXeNsAvIIHCpWHy
 HJY43WRP98DTNyP+/oBrsh56y8n+NwMBcWSmL4tv4cKmGx11bRvp35Mzy1ElPw6Y
 Zzttsw8Puz2EMmfGdcRwkZW0KWb5sAvJcImCkrjg/13QPHgcPgk=
 =dTSD
 -----END PGP SIGNATURE-----

Merge tag 'mfd-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "Core Framework:
   - Document (kerneldoc) core mfd_add_devices() API

  New Drivers:
   - Altera SOCFPGA System Manager
   - Maxim MAX77650/77651 PMIC
   - Maxim MAX77663 PMIC
   - ST Multi-Function eXpander (STMFX)

  New Device Support:
   - LEDs support in Intel Cherry Trail Whiskey Cove PMIC
   - RTC support in SAMSUNG Electronics S2MPA01 PMIC
   - SAM9X60 support in Atmel HLCDC (High-end LCD Controller)
   - USB X-Powers AXP 8xx PMICs
   - Integrated Sensor Hub (ISH) in ChromeOS EC
   - USB PD Logger in ChromeOS EC
   - AXP223 in X-Powers AXP series PMICs
   - Power Supply in X-Powers AXP 803 PMICs
   - Comet Lake in Intel Low Power Subsystem
   - Fingerprint MCU in ChromeOS EC
   - Touchpad MCU in ChromeOS EC
   - Move TI LM3532 support to LED

  New Functionality:
   - max77650, max77620: Add/extend DT support
   - max77620 power-off
   - syscon clocking
   - croc_ec host sleep event

  Fix-ups:
   - Trivial; Formatting, spelling, etc; Kconfig, sec-core, ab8500-debugfs
   - Remove unused functionality; rk808, da9063-*
   - SPDX conversion; da9063-*, atmel-*,
   - Adapt/add new register definitions; cs47l35-tables, cs47l90-tables, imx6q-iomuxc-gpr
   - Fix-up DT bindings; ti-lmu, cirrus,lochnagar
   - Simply obtaining driver data; ssbi, t7l66xb, tc6387xb, tc6393xb

  Bug Fixes:
   - Fix incorrect defined values; max77620, da9063
   - Fix device initialisation; twl6040
   - Reset device on init; intel-lpss
   - Fix build warnings when !OF; sun6i-prcm
   - Register OF match tables; tps65912-spi
   - Fix DMI matching; intel_quark_i2c_gpio"

* tag 'mfd-next-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (65 commits)
  mfd: Use dev_get_drvdata() directly
  mfd: cros_ec: Instantiate properly CrOS Touchpad MCU device
  mfd: cros_ec: Instantiate properly CrOS FP MCU device
  mfd: cros_ec: Update the EC feature codes
  mfd: intel-lpss: Add Intel Comet Lake PCI IDs
  mfd: lochnagar: Add links to binding docs for sound and hwmon
  mfd: ab8500-debugfs: Fix a typo ("deubgfs")
  mfd: imx6sx: Add MQS register definition for iomuxc gpr
  dt-bindings: mfd: LMU: Fix lm3632 dt binding example
  mfd: intel_quark_i2c_gpio: Adjust IOT2000 matching
  mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L
  mfd: tps65912-spi: Add missing of table registration
  mfd: axp20x: Add USB power supply mfd cell to AXP803
  mfd: sun6i-prcm: Fix build warning for non-OF configurations
  mfd: intel-lpss: Set the device in reset state when init
  platform/chrome: Add support for v1 of host sleep event
  mfd: cros_ec: Add host_sleep_event_v1 command
  mfd: cros_ec: Instantiate the CrOS USB PD logger driver
  mfd: cs47l90: Make DAC_AEC_CONTROL_2 readable
  mfd: cs47l35: Make DAC_AEC_CONTROL_2 readable
  ...
2019-05-14 10:39:08 -07:00
Linus Torvalds 414147d99b pci-v5.2-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlzZ/4MUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwmYw/+Mzkkz/zOpzYdsYyy6Xv3qRdn92Kp
 bePOPACdwpUK+HV4qE6EEYBcVZdkO7NMkshA7wIb4VlsE0sVHSPvlybUmTUGWeFd
 CG87YytVOo4K7cAeKdGVwGaoQSeaZX3wmXVGGQtm/T4b63GdBjlNJ/MBuPWDDMlM
 XEis29MTH6xAu3MbT7pp5q+snSzOmt0RWuVpX/U1YcZdhu8fbwfOxj9Jx6slh4+2
 MvseYNNrTRJrMF0o5o83Khx3tAcW8OTTnDJ9+BCrAlE1PId1s/KjzY6nqReBtom9
 CIqtwAlx/wGkRBRgfsmEtFBhkDA05PPilhSy6k2LP8B4A3qBqir1Pd+5bhHG4FIu
 nPPCZjZs2+0DNrZwQv59qIlWsqDFm214WRln9Z7d/VNtrLs2UknVghjQcHv7rP+K
 /NKfPlAuHTI/AFi9pIPFWTMx5J4iXX1hX4LiptE9M0k9/vSiaCVnTS3QbFvp3py3
 VTT9sprzfV4JX4aqS/rbQc/9g4k9OXPW9viOuWf5rYZJTBbsu6PehjUIRECyFaO+
 0gDqE8WsQOtNNX7e5q2HJ/HpPQ+Q1IIlReC+1H56T/EQZmSIBwhTLttQMREL/8af
 Lka3/1SVUi4WG6SBrBI75ClsR91UzE6AK+h9fAyDuR6XJkbysWjkyG6Lmy617g6w
 lb+fQwOzUt4eGDo=
 =4Vc+
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration changes:

   - Add _HPX Type 3 settings support, which gives firmware more
     influence over device configuration (Alexandru Gagniuc)

   - Support fixed bus numbers from bridge Enhanced Allocation
     capabilities (Subbaraya Sundeep)

   - Add "external-facing" DT property to identify cases where we
     require IOMMU protection against untrusted devices (Jean-Philippe
     Brucker)

   - Enable PCIe services for host controller drivers that use managed
     host bridge alloc (Jean-Philippe Brucker)

   - Log PCIe port service messages with pci_dev, not the pcie_device
     (Frederick Lawler)

   - Convert pciehp from pciehp_debug module parameter to generic
     dynamic debug (Frederick Lawler)

  Peer-to-peer DMA:

   - Add whitelist of Root Complexes that support peer-to-peer DMA
     between Root Ports (Christian König)

  Native controller drivers:

   - Add PCI host bridge DMA ranges for bridges that can't DMA
     everywhere, e.g., iProc (Srinath Mannam)

   - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan
     Chocron)

   - Fix Tegra MSI target allocation so DMA doesn't generate unwanted
     MSIs (Vidya Sagar)

   - Fix of_node reference leaks (Wen Yang)

   - Fix Hyper-V module unload & device removal issues (Dexuan Cui)

   - Cleanup R-Car driver (Marek Vasut)

   - Cleanup Keystone driver (Kishon Vijay Abraham I)

   - Cleanup i.MX6 driver (Andrey Smirnov)

  Significant bug fixes:

   - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude
     Paul)

   - Fix Switchtec firmware update performance issue (Wesley Sheng)

   - Work around Pericom switch link retraining erratum (Stefan Mätje)"

* tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits)
  MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
  PCI: Replace printk(KERN_INFO) with pr_info(), etc
  PCI: Use dev_printk() when possible
  PCI: Cleanup setup-bus.c comments and whitespace
  PCI: imx6: Allow asynchronous probing
  PCI: dwc: Save root bus for driver remove hooks
  PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
  PCI: dwc: Free MSI in dw_pcie_host_init() error path
  PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi()
  ...
2019-05-14 10:30:10 -07:00
Linus Torvalds 318222a35b Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few misc things and hotfixes

 - ocfs2

 - almost all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (139 commits)
  kernel/memremap.c: remove the unused device_private_entry_fault() export
  mm: delete find_get_entries_tag
  mm/huge_memory.c: make __thp_get_unmapped_area static
  mm/mprotect.c: fix compilation warning because of unused 'mm' variable
  mm/page-writeback: introduce tracepoint for wait_on_page_writeback()
  mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags
  mm/Kconfig: update "Memory Model" help text
  mm/vmscan.c: don't disable irq again when count pgrefill for memcg
  mm: memblock: make keeping memblock memory opt-in rather than opt-out
  hugetlbfs: always use address space in inode for resv_map pointer
  mm/z3fold.c: support page migration
  mm/z3fold.c: add structure for buddy handles
  mm/z3fold.c: improve compression by extending search
  mm/z3fold.c: introduce helper functions
  mm/page_alloc.c: remove unnecessary parameter in rmqueue_pcplist
  mm/hmm: add ARCH_HAS_HMM_MIRROR ARCH_HAS_HMM_DEVICE Kconfig
  mm/vmscan.c: simplify shrink_inactive_list()
  fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback
  xen/privcmd-buf.c: convert to use vm_map_pages_zero()
  xen/gntdev.c: convert to use vm_map_pages()
  ...
2019-05-14 10:10:55 -07:00
Christoph Hellwig 640be2d1ff kernel/memremap.c: remove the unused device_private_entry_fault() export
This export has been entirely unused since it was added more than 1 1/2
years ago.

Link: http://lkml.kernel.org/r/20190429115535.12793-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Matthew Wilcox (Oracle) a1b8e6abf3 mm: delete find_get_entries_tag
I removed the only user of this and hadn't noticed it was now unused.

Link: http://lkml.kernel.org/r/20190430152929.21813-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Bharath Vedartham b3b07077b0 mm/huge_memory.c: make __thp_get_unmapped_area static
__thp_get_unmapped_area is only used in mm/huge_memory.c.  Make it static.
Tested by building and booting the kernel.

Link: http://lkml.kernel.org/r/20190504102353.GA22525@bharath12345-Inspiron-5559
Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Mike Rapoport 94393c7896 mm/mprotect.c: fix compilation warning because of unused 'mm' variable
Since 0cbe3e26ab ("mm: update ptep_modify_prot_start/commit to take
vm_area_struct as arg") the only place that uses the local 'mm' variable
in change_pte_range() is the call to set_pte_at().

Many architectures define set_pte_at() as macro that does not use the 'mm'
parameter, which generates the following compilation warning:

 CC      mm/mprotect.o
mm/mprotect.c: In function 'change_pte_range':
mm/mprotect.c:42:20: warning: unused variable 'mm' [-Wunused-variable]
  struct mm_struct *mm = vma->vm_mm;
                    ^~

Fix it by passing vma->mm to set_pte_at() and dropping the local 'mm'
variable in change_pte_range().

[liu.song.a23@gmail.com: fix missed conversions]
  Link: http://lkml.kernel.org/r/CAPhsuW6wcQgYLHNdBdw6m0YiR4RWsS4XzfpSKU7wBLLeOCTbpw@mail.gmail.comLink: http://lkml.kernel.org/r/1557305432-4940-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Yafang Shao 19343b5bdd mm/page-writeback: introduce tracepoint for wait_on_page_writeback()
Recently there have been some hung tasks on our server due to
wait_on_page_writeback(), and we want to know the details of this
PG_writeback, i.e.  this page is writing back to which device.  But it is
not so convenient to get the details.

I think it would be better to introduce a tracepoint for diagnosing the
writeback details.

Link: http://lkml.kernel.org/r/1556274402-19018-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Yafang Shao 60b62ff7cc mm/vmscan: simplify trace_reclaim_flags and trace_shrink_flags
trace_reclaim_flags and trace_shrink_flags are almost the same.
We can simplify them to avoid redundant code.

Link: http://lkml.kernel.org/r/1556169203-5858-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Mike Rapoport d66d109d3c mm/Kconfig: update "Memory Model" help text
The help describing the memory model selection is outdated.  It still says
that SPARSEMEM is experimental and DISCONTIGMEM is a preferred over
SPARSEMEM.

Update the help text for the relevant options:
* add a generic help for the "Memory Model" prompt
* add description for FLATMEM
* reduce the description of DISCONTIGMEM and add a deprecation note
* prefer SPARSEMEM over DISCONTIGMEM

Link: http://lkml.kernel.org/r/1556188531-20728-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Yafang Shao 2fa2690ca6 mm/vmscan.c: don't disable irq again when count pgrefill for memcg
We can use __count_memcg_events() directly because this callsite is alreay
protected by spin_lock_irq().

Link: http://lkml.kernel.org/r/1556093494-30798-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:51 -07:00
Mike Rapoport 350e88bad4 mm: memblock: make keeping memblock memory opt-in rather than opt-out
Most architectures do not need the memblock memory after the page
allocator is initialized, but only few enable ARCH_DISCARD_MEMBLOCK in the
arch Kconfig.

Replacing ARCH_DISCARD_MEMBLOCK with ARCH_KEEP_MEMBLOCK and inverting the
logic makes it clear which architectures actually use memblock after
system initialization and skips the necessity to add ARCH_DISCARD_MEMBLOCK
to the architectures that are still missing that option.

Link: http://lkml.kernel.org/r/1556102150-32517-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Mike Kravetz f27a5136f7 hugetlbfs: always use address space in inode for resv_map pointer
Continuing discussion about 58b6e5e8f1 ("hugetlbfs: fix memory leak for
resv_map") brought up the issue that inode->i_mapping may not point to the
address space embedded within the inode at inode eviction time.  The
hugetlbfs truncate routine handles this by explicitly using inode->i_data.
However, code cleaning up the resv_map will still use the address space
pointed to by inode->i_mapping.  Luckily, private_data is NULL for address
spaces in all such cases today but, there is no guarantee this will
continue.

Change all hugetlbfs code getting a resv_map pointer to explicitly get it
from the address space embedded within the inode.  In addition, add more
comments in the code to indicate why this is being done.

Link: http://lkml.kernel.org/r/20190419204435.16984-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Yufen Yu <yuyufen@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Vitaly Wool 1f862989b0 mm/z3fold.c: support page migration
Now that we are not using page address in handles directly, we can make
z3fold pages movable to decrease the memory fragmentation z3fold may
create over time.

This patch starts advertising non-headless z3fold pages as movable and
uses the existing kernel infrastructure to implement moving of such pages
per memory management subsystem's request.  It thus implements 3 required
callbacks for page migration:

* isolation callback: z3fold_page_isolate(): try to isolate the page by
  removing it from all lists.  Pages scheduled for some activity and
  mapped pages will not be isolated.  Return true if isolation was
  successful or false otherwise

* migration callback: z3fold_page_migrate(): re-check critical
  conditions and migrate page contents to the new page provided by the
  memory subsystem.  Returns 0 on success or negative error code otherwise

* putback callback: z3fold_page_putback(): put back the page if
  z3fold_page_migrate() for it failed permanently (i.  e.  not with
  -EAGAIN code).

[lkp@intel.com: z3fold_page_isolate() can be static]
  Link: http://lkml.kernel.org/r/20190419130924.GA161478@ivb42
Link: http://lkml.kernel.org/r/20190417103922.31253da5c366c4ebe0419cfc@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Signed-off-by: kbuild test robot <lkp@intel.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Vitaly Wool 7c2b8baa61 mm/z3fold.c: add structure for buddy handles
For z3fold to be able to move its pages per request of the memory
subsystem, it should not use direct object addresses in handles.  Instead,
it will create abstract handles (3 per page) which will contain pointers
to z3fold objects.  Thus, it will be possible to change these pointers
when z3fold page is moved.

Link: http://lkml.kernel.org/r/20190417103826.484eaf18c1294d682769880f@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Vitaly Wool 351618b203 mm/z3fold.c: improve compression by extending search
The current z3fold implementation only searches this CPU's page lists for
a fitting page to put a new object into.  This patch adds quick search for
very well fitting pages (i.  e.  those having exactly the required number
of free space) on other CPUs too, before allocating a new page for that
object.

Link: http://lkml.kernel.org/r/20190417103733.72ae81abe1552397c95a008e@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Vitaly Wool 9050cce104 mm/z3fold.c: introduce helper functions
Patch series "z3fold: support page migration", v2.

This patchset implements page migration support and slightly better buddy
search.  To implement page migration support, z3fold has to move away from
the current scheme of handle encoding.  i.  e.  stop encoding page address
in handles.  Instead, a small per-page structure is created which will
contain actual addresses for z3fold objects, while pointers to fields of
that structure will be used as handles.

Thus, it will be possible to change the underlying addresses to reflect
page migration.

To support migration itself, 3 callbacks will be implemented:

1: isolation callback: z3fold_page_isolate(): try to isolate the page
   by removing it from all lists.  Pages scheduled for some activity and
   mapped pages will not be isolated.  Return true if isolation was
   successful or false otherwise

2: migration callback: z3fold_page_migrate(): re-check critical
   conditions and migrate page contents to the new page provided by the
   system.  Returns 0 on success or negative error code otherwise

3: putback callback: z3fold_page_putback(): put back the page if
   z3fold_page_migrate() for it failed permanently (i.  e.  not with
   -EAGAIN code).

To make sure an isolated page doesn't get freed, its kref is incremented
in z3fold_page_isolate() and decremented during post-migration compaction,
if migration was successful, or by z3fold_page_putback() in the other
case.

Since the new handle encoding scheme implies slight memory consumption
increase, better buddy search (which decreases memory consumption) is
included in this patchset.

This patch (of 4):

Introduce a separate helper function for object allocation, as well as 2
smaller helpers to add a buddy to the list and to get a pointer to the
pool from the z3fold header.  No functional changes here.

Link: http://lkml.kernel.org/r/20190417103633.a4bb770b5bf0fb7e43ce1666@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Yafang Shao 1c52e6d068 mm/page_alloc.c: remove unnecessary parameter in rmqueue_pcplist
Because rmqueue_pcplist() is only called when order is 0, we don't need to
use order as a parameter.

Link: http://lkml.kernel.org/r/1555591709-11744-1-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Jérôme Glisse 2c8fc3dcf2 mm/hmm: add ARCH_HAS_HMM_MIRROR ARCH_HAS_HMM_DEVICE Kconfig
Add 2 new Kconfig variables that are not used by anyone.  I check that
various make ARCH=somearch allmodconfig do work and do not complain.  This
new Kconfig needs to be added first so that device drivers that depend on
HMM can be updated.

Once drivers are updated then I can update the HMM Kconfig to depend on
this new Kconfig in a followup patch.

This is about solving Kconfig for HMM given that device driver are
going through their own tree we want to avoid changing them from the mm
tree.  So plan is:

1 - Kernel release N add the new Kconfig to mm/Kconfig (this patch)
2 - Kernel release N+1 update driver to depend on new Kconfig ie
    stop using ARCH_HASH_HMM and start using ARCH_HAS_HMM_MIRROR
    and ARCH_HAS_HMM_DEVICE (one or the other or both depending
    on the driver)
3 - Kernel release N+2 remove ARCH_HASH_HMM and do final Kconfig
    update in mm/Kconfig

Link: http://lkml.kernel.org/r/20190417211141.17580-1-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Kirill Tkhai f46b79120e mm/vmscan.c: simplify shrink_inactive_list()
This merges together duplicated patterns of code.  Also, replace
count_memcg_events() with its irq-careless namesake, because they are
already called in interrupts disabled context.

Link: http://lkml.kernel.org/r/2ece1df4-2989-bc9b-6172-61e9fdde5bfd@virtuozzo.com
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Amir Goldstein c553ea4fdf fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback
23d0127096 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE
writeback") claims that sync_file_range(2) syscall was "created for
userspace to be able to issue background writeout and so waiting for
in-flight IO is undesirable there" and changes the writeback (back) to
WB_SYNC_NONE.

This claim is only partially true.  It is true for users that use the flag
SYNC_FILE_RANGE_WRITE by itself, as does PostgreSQL, the user that was the
reason for changing to WB_SYNC_NONE writeback.

However, that claim is not true for users that use that flag combination
SYNC_FILE_RANGE_{WAIT_BEFORE|WRITE|_WAIT_AFTER}.  Those users explicitly
requested to wait for in-flight IO as well as to writeback of dirty pages.

Re-brand that flag combination as SYNC_FILE_RANGE_WRITE_AND_WAIT and use
WB_SYNC_ALL writeback to perform the full range sync request.

Link: http://lkml.kernel.org/r/20190409114922.30095-1-amir73il@gmail.com
Link: http://lkml.kernel.org/r/20190419072938.31320-1-amir73il@gmail.com
Fixes: 23d0127096 ("fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Jan Kara <jack@suse.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder 5326905798 xen/privcmd-buf.c: convert to use vm_map_pages_zero()
Convert to use vm_map_pages_zero() to map range of kernel memory to user
vma.

This driver has ignored vm_pgoff.  We could later "fix" these drivers to
behave according to the normal vm_pgoff offsetting simply by removing the
_zero suffix on the function name and if that causes regressions, it gives
us an easy way to revert.

Link: http://lkml.kernel.org/r/acf678e81d554d01a9b590716ac0ccbdcdf71c25.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder df9bde015a xen/gntdev.c: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

map->count is passed to vm_map_pages() and internal API verify map->count
against count ( count = vma_pages(vma)) for page array boundary overrun
condition.

Link: http://lkml.kernel.org/r/88e56e82d2db98705c2d842e9c9806c00b366d67.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder a17ae14766 videobuf2/videobuf2-dma-sg.c: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

vm_pgoff is treated in V4L2 API as a 'cookie' to select a buffer, not as a
in-buffer offset by design and it always want to mmap a whole buffer from
its beginning.

Link: http://lkml.kernel.org/r/a953fe6b3056de1cc6eab654effdd4a22f125375.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder b0d0084fd9 iommu/dma-iommu.c: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

Link: http://lkml.kernel.org/r/80c3d220fc6ada73a88ce43ca049afb55a889258.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder e60b72b1a9 drm/xen/xen_drm_front_gem.c: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

Link: http://lkml.kernel.org/r/ff8e10ba778d79419c66ee8215bccf01560540fd.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder 2f69b3c8ba drm/rockchip/rockchip_drm_gem.c: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

Tested on Rockchip hardware and display is working, including talking to
Lima via prime.

Link: http://lkml.kernel.org/r/7ba359eb1aceac388d05983c1f29b915bdf291f9.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder 22660db892 drivers/firewire/core-iso.c: convert to use vm_map_pages_zero()
Convert to use vm_map_pages_zero() to map range of kernel memory to user
vma.

This driver has ignored vm_pgoff and mapped the entire pages.  We could
later "fix" these drivers to behave according to the normal vm_pgoff
offsetting simply by removing the _zero suffix on the function name and if
that causes regressions, it gives us an easy way to revert.

Link: http://lkml.kernel.org/r/88645f5ea8202784a8baaf389e592aeb8c505e8e.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder 6248461d21 arm: mm: dma-mapping: convert to use vm_map_pages()
Convert to use vm_map_pages() to map range of kernel memory to user vma.

Link: http://lkml.kernel.org/r/936e5e107c746a7310e3a3c471188ca3ac8f9754.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Souptick Joarder a667d7456f mm: introduce new vm_map_pages() and vm_map_pages_zero() API
Patch series "mm: Use vm_map_pages() and vm_map_pages_zero() API", v5.

This patch (of 5):

Previouly drivers have their own way of mapping range of kernel
pages/memory into user vma and this was done by invoking vm_insert_page()
within a loop.

As this pattern is common across different drivers, it can be generalized
by creating new functions and using them across the drivers.

vm_map_pages() is the API which can be used to map kernel memory/pages in
drivers which have considered vm_pgoff

vm_map_pages_zero() is the API which can be used to map a range of kernel
memory/pages in drivers which have not considered vm_pgoff.  vm_pgoff is
passed as default 0 for those drivers.

We _could_ then at a later "fix" these drivers which are using
vm_map_pages_zero() to behave according to the normal vm_pgoff offsetting
simply by removing the _zero suffix on the function name and if that
causes regressions, it gives us an easy way to revert.

Tested on Rockchip hardware and display is working, including talking to
Lima via prime.

Link: http://lkml.kernel.org/r/751cb8a0f4c3e67e95c58a3b072937617f338eea.1552921225.git.jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Suggested-by: Russell King <linux@armlinux.org.uk>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Bartlomiej Zolnierkiewicz 62afcd1cb8 mm: remove redundant 'default n' from Kconfig-s
'default n' is the default value for any bool or tristate Kconfig
setting so there is no need to write it explicitly.

Also since commit f467c5640c ("kconfig: only write '# CONFIG_FOO
is not set' for visible symbols") the Kconfig behavior is the same
regardless of 'default n' being present or not:

    ...
    One side effect of (and the main motivation for) this change is making
    the following two definitions behave exactly the same:

        config FOO
                bool

        config FOO
                bool
                default n

    With this change, neither of these will generate a
    '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
    That might make it clearer to people that a bare 'default n' is
    redundant.
    ...

Link: http://lkml.kernel.org/r/c3385916-e4d4-37d3-b330-e6b7dff83a52@samsung.com
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00
Johannes Weiner 8c7829b04c mm: fix false-positive OVERCOMMIT_GUESS failures
With the default overcommit==guess we occasionally run into mmap
rejections despite plenty of memory that would get dropped under
pressure but just isn't accounted reclaimable. One example of this is
dying cgroups pinned by some page cache. A previous case was auxiliary
path name memory associated with dentries; we have since annotated
those allocations to avoid overcommit failures (see d79f7aa496 ("mm:
treat indirectly reclaimable memory as free in overcommit logic")).

But trying to classify all allocated memory reliably as reclaimable
and unreclaimable is a bit of a fool's errand. There could be a myriad
of dependencies that constantly change with kernel versions.

It becomes even more questionable of an effort when considering how
this estimate of available memory is used: it's not compared to the
system-wide allocated virtual memory in any way. It's not even
compared to the allocating process's address space. It's compared to
the single allocation request at hand!

So we have an elaborate left-hand side of the equation that tries to
assess the exact breathing room the system has available down to a
page - and then compare it to an isolated allocation request with no
additional context. We could fail an allocation of N bytes, but for
two allocations of N/2 bytes we'd do this elaborate dance twice in a
row and then still let N bytes of virtual memory through. This doesn't
make a whole lot of sense.

Let's take a step back and look at the actual goal of the
heuristic. From the documentation:

   Heuristic overcommit handling. Obvious overcommits of address
   space are refused. Used for a typical system. It ensures a
   seriously wild allocation fails while allowing overcommit to
   reduce swap usage.  root is allowed to allocate slightly more
   memory in this mode. This is the default.

If all we want to do is catch clearly bogus allocation requests
irrespective of the general virtual memory situation, the physical
memory counter-part doesn't need to be that complicated, either.

When in GUESS mode, catch wild allocations by comparing their request
size to total amount of ram and swap in the system.

Link: http://lkml.kernel.org/r/20190412191418.26333-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:50 -07:00