redonkable/alistair23-linux

Author	SHA1	Message	Date
Bjorn Helgaas	68feac87de	PCI: add pci_common_swizzle() for INTx swizzling This patch adds pci_common_swizzle(), which swizzles INTx values all the way up to a root bridge. This common implementation can replace several architecture-specific ones. This should someday be combined with pci_get_interrupt_pin(), but I left it separate for now to make reviewing easier. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:12 -08:00
Kenji Kaneshige	e8c331e963	PCI hotplug: introduce functions for ACPI slot detection Some ACPI related PCI hotplug code can be shared among PCI hotplug drivers. This patch introduces the following functions in drivers/pci/hotplug/acpi_pcihp.c to share the code, and changes acpiphp and pciehp to use them. - int acpi_pci_detect_ejectable(struct pci_bus pbus) This checks if the specified PCI bus has ejectable slots. - int acpi_pci_check_ejectable(struct pci_bus pbus, acpi_handle handle) This checks if the specified handle is ejectable ACPI PCI slot. The 'pbus' parameter is needed to check if 'handle' is PCI related ACPI object. This patch also introduces the following inline function in include/linux/pci-acpi.h, which is useful to get ACPI handle of the PCI bridge from struct pci_bus of the bridge's secondary bus. - static inline acpi_handle acpi_pci_get_bridge_handle(struct pci_bus *pbus) This returns ACPI handle of the PCI bridge which generates PCI bus specified by 'pbus'. Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:11 -08:00
Yu Zhao	fde09c6d8f	PCI: define PCI resource names in an 'enum' This patch moves all definitions of the PCI resource names to an 'enum', and also replaces some hard-coded resource variables with symbol names. This change eases introduction of device specific resources. Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:01 -08:00
Yu Zhao	14add80b51	PCI: remove unnecessary arg of pci_update_resource() This cleanup removes unnecessary argument 'struct resource *res' in pci_update_resource(), so it takes same arguments as other companion functions (pci_assign_resource(), etc.). Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:13:00 -08:00
Andrew Morton	1684f5ddd4	PCI: uninline pci_ioremap_bar() It's too large to be inlined. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:57 -08:00
Bjorn Helgaas	57c2cf71c1	PCI: add pci_swizzle_interrupt_pin() This patch adds pci_swizzle_interrupt_pin(), which implements the INTx swizzling algorithm specified in Table 9-1 of the "PCI-to-PCI Bridge Architecture Specification," revision 1.2. There are many architecture-specific implementations of this swizzle that can be replaced by this common one. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:50 -08:00
Arjan van de Ven	e8de1481fd	resource: allow MMIO exclusivity for device drivers Device drivers that use pci_request_regions() (and similar APIs) have a reasonable expectation that they are the only ones accessing their device. As part of the e1000e hunt, we were afraid that some userland (X or some bootsplash stuff) was mapping the MMIO region that the driver thought it had exclusively via /dev/mem or via various sysfs resource mappings. This patch adds the option for device drivers to cause their reserved regions to the "banned from /dev/mem use" list, so now both kernel memory and device-exclusive MMIO regions are banned. NOTE: This is only active when CONFIG_STRICT_DEVMEM is set. In addition to the config option, a kernel parameter iomem=relaxed is provided for the cases where developers want to diagnose, in the field, drivers issues from userspace. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:32 -08:00
Andrew Patterson	2361694191	ACPI/PCI: remove obsolete _OSC capability support functions The acpi_query_osc, __pci_osc_support_set, pci_osc_support_set, and pcie_osc_support_set functions have been obsoleted in favor of setting these capabilities during root bridge discovery with pci_acpi_osc_support. There are no longer any callers of these functions, so remove them. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:32 -08:00
Andrew Patterson	07ae95f988	ACPI/PCI: PCI MSI _OSC support capabilities called when root bridge added The _OSC capability OSC_MSI_SUPPORT is set when the root bridge is added with pci_acpi_osc_support(), so we no longer need to do it in the PCI MSI driver. Also adds the function pci_msi_enabled, which returns true if pci=nomsi is not on the kernel command-line. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:31 -08:00
Andrew Patterson	3e1b16002a	ACPI/PCI: PCIe ASPM _OSC support capabilities called when root bridge added The _OSC capabilities OSC_ACTIVE_STATE_PWR_SUPPORT and OSC_CLOCK_PWR_CAPABILITY_SUPPORT are set when the root bridge is added with pci_acpi_osc_support(), so we no longer need to do it in the ASPM driver. Also add the function pcie_aspm_enabled, which returns true if pcie_aspm=off is not on the kernel command-line. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:29 -08:00
Andrew Patterson	0ef5f8f615	ACPI/PCI: PCI extended config _OSC support called when root bridge added The _OSC capability OSC_EXT_PCI_CONFIG_SUPPORT is set when the root bridge is added with pci_acpi_osc_support() if we can access PCI extended config space. This adds the function pci_ext_cfg_avail which returns true if we can access PCI extended config space (offset greater than 0xff). It currently only returns false if arch=x86 and raw_pci_ext_ops is not set (which might happen if pci=nommcfg is set on the kernel command-line). Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:28 -08:00
Andrew Patterson	990a7ac564	ACPI/PCI: call _OSC support during root bridge discovery Add pci_acpi_osc_support() and call it when a PCI bridge is added. This allows us to avoid having every individual PCI root bridge driver call _OSC support for every root bridge in their probe functions, a significant savings in boot time. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:27 -08:00
Andrew Patterson	8b62091e20	ACPI/PCI: include missing acpi.h file in pci-acpi.h. The pci-acpi.h file will not compile without including linux/acpi.h. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:26 -08:00
Sheng Yang	f7b7baae6b	PCI: add PCI Advanced Feature Capability defines PCI Advanced Features Capability is introduced by "Conventional PCI Advanced Caps ECN" (can be downloaded in pcisig.com). Add defines for the various AF capabilities, including function level reset (FLR). Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:24 -08:00
Inaky Perez-Gonzalez	e306987434	wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install These two files are what user space can use to establish communication with the WiMAX kernel API and to speak the Intel 2400m Wireless WiMAX connection's control protocol. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:22 -08:00
Inaky Perez-Gonzalez	ea24652d25	i2400m: host/device procotol and core driver definitions The wimax/i2400m.h defines the structures and constants for the host-device protocols: - boot / firmware upload protocol - general data transport protocol - control protocol It is done in such a way that can also be used verbatim by user space. drivers/net/wimax/i2400m.h defines all the APIs used by the core, bus-generic driver (i2400m) and the bus specific drivers (i2400m-BUSNAME). It also gives a roadmap to the driver implementation. debug-levels.h adds the core driver's debug settings. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:18 -08:00
Inaky Perez-Gonzalez	ea912f4e7f	wimax: debug macros and debug settings for the WiMAX stack This file contains a simple debug framework that is used in the stack; it allows the debug level to be controlled at compile-time (so the debug code is optimized out) and at run-time (for what wasn't compiled out). This is eventually going to be moved to use dynamic_printk(). Just need to find time to do it. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:17 -08:00
Inaky Perez-Gonzalez	ace22f0881	wimax: headers for kernel API and user space interaction Definitions for the user/kernel API protocol through generic netlink. User space can copy it verbatim and use it. Kernel API definition declares the main data types and calls for the drivers to integrate into the WiMAX stack. Provides usage documentation. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:16 -08:00
Inaky Perez-Gonzalez	5e07878787	debugfs: add helpers for exporting a size_t simple value In the same spirit as debugfs_create_*(), introduce helpers for exporting size_t values over debugfs. The only trick done is that the format verifier is kept at %llu instead of %zu; otherwise type warnings would pop up: format ‘%zu’ expects type ‘size_t’, but argument 2 has type ‘long long unsigned int’ There is no real way to fix this one--however, we can consider %llu and %zu to be compatible if we consider that we are using the same for validating in debugfs_create_{x,u}{8,16,32}(). Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:16 -08:00
Greg Kroah-Hartman	34c65d82e0	USB: remove info() macro from usb.h USB should not be having it's own printk macros, so remove info() and use the system-wide standard of dev_info() wherever possible. No one in the tree is using the macro, so it can now be removed. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:14 -08:00
Greg Kroah-Hartman	338b67b0c1	USB: remove warn() macro from usb.h USB should not be having it's own printk macros, so remove warn() and use the system-wide standard of dev_warn() wherever possible. In the few places that will not work out, use a basic printk(). Now that all in-tree users are gone, remove the macro. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:14 -08:00
Alan Stern	25ff1c316f	USB: storage: add last-sector hacks This patch (as1189b) adds some hacks to usb-storage for dealing with the growing problems involving bad capacity values and last-sector accesses: A new flag, US_FL_CAPACITY_OK, is created to indicate that the device is known to report its capacity correctly. An unusual_devs entry for Linux's own File-backed Storage Gadget is added with this flag set, since g_file_storage always reports the correct capacity and since the capacity need not be even (it is determined by the size of the backing file). An entry in unusual_devs.h which has only the CAPACITY_OK flag set shouldn't prejudice libusual, since the device will work perfectly well with either usb-storage or ub. So a new macro, COMPLIANT_DEV, is added to let libusual know about these entries. When a last-sector access succeeds and the total number of sectors is odd (the unexpected case, in which guessing that the number is even might cause trouble), a WARN is triggered. The kerneloops.org project will collect these warnings, allowing us to add CAPACITY_OK flags for the devices in question before implementing the default-to-even heuristic. If users want to prevent the stack dump produced by the WARN, they can disable the hack by adding an unusual_devs entry for their device with the CAPACITY_OK flag. When a last-sector access fails three times in a row and neither the FIX_CAPACITY nor the CAPACITY_OK flag is set, we assume the last-sector bug is present. We replace the existing status and sense data with values that will cause the SCSI core to fail the access immediately rather than retry indefinitely. This should fix the difficulties people have been having with Nokia phones. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:11 -08:00
Oliver Neukum	856395d6e1	USB: extension of anchor API to unpoison an anchor This extension allows unpoisoning an anchor allowing drivers that resubmit URBs to reuse an anchor for methods like resume() Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:11 -08:00
Ming Lei	49367d8f1d	USB: mark "reject" field of struct urb as atomic_t It is enough to protect accesses to reject field of urb by marking it as atomic_t,also it is the only reason of existence of usb_reject_lock,so remove the lock to make code more clean. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:08 -08:00
Alan Stern	3b23dd6f8a	USB: utilize the bus notifiers This patch (as1185) makes usbcore take advantage of the bus notifications sent out by the driver core. Now we can create all our device and interface attribute files before the device or interface uevent is broadcast. A side effect is that we no longer create the endpoint "pseudo" devices at the same time as a device or interface is registered -- it seems like a bad idea to try registering an endpoint before the registration of its parent is complete. So the routines for creating and removing endpoint devices have been split out and renamed, and they are called explicitly when needed. A new bitflag is used for keeping track of whether or not the interface's endpoint devices have been created, since (just as with the interface attributes) they vary with the altsetting and hence can be changed at random times. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:08 -08:00
Bryan Wu	2ffcdb3bda	USB: musb: use new platform data interface of musb to replace old one Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:06 -08:00
Alan Stern	65bfd2967c	USB: Enhance usage of pm_message_t This patch (as1177) modifies the USB core suspend and resume routines. The resume functions now will take a pm_message_t argument, so they will know what sort of resume is occurring. The new argument is also passed to the port suspend/resume and bus suspend/resume routines (although they don't use it for anything but debugging). In addition, special pm_message_t values are used for user-initiated, device-initiated (i.e., remote wakeup), and automatic suspend/resume. By testing these values, drivers can tell whether or not a particular suspend was an autosuspend. Unfortunately, they can't do the same for resumes -- not until the pm_message_t argument is also passed to the drivers' resume methods. That will require a bigger change. IMO, the whole Power Management framework should have been set up this way in the first place. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:03 -08:00
Philipp Zabel	68144e0cc9	USB: otg: add otg_put_transceiver() As Russell King points out, calling put_device(otg_transceiver->dev) directly in driver cleanup paths makes assumptions about otg_transceiver internals. Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:02 -08:00
Philipp Zabel	6084f1bf0c	USB: otg: gpio_vbus transceiver stub gpio_vbus provides simple GPIO VBUS sensing for peripheral controllers with an internal transceiver. Optionally, a second GPIO can be used to control D+ pullup. It also interfaces with the regulator framework to limit charging currents when powered via USB. gpio_vbus requests the regulator supplying "vbus_draw" and can enable/disable it or limit its current depending on USB state. [dbrownell@users.sourceforge.net: use drivers/otg, cleanups ] Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 10:00:02 -08:00
Ben Efros	1537e0ad94	USB: storage devices and SAT Add the SANE SENSE flag to indicate that a device is capable of handling more than 18-bytes of sense data. This functionality is required for USB-ATA bridges implementing SAT. A future patch will actually enable this function for several devices. The logic behind this is that we can detect support for SANE_SENSE in a few ways: 1) ATA PASS THROUGH (12) or (16) execute successfully 2) SPC-3 or higher is in use 3) A previous CHECK CONDITION occurred with sense format 70-73 and had a length greater than 18-bytes total Signed-off-by: Ben Efros <ben@pc-doctor.com> Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:55 -08:00
Pete Zaitcev	f150fa1afb	USB: Allow usbmon as a module even if usbcore is builtin usbmon can only be built as a module if usbcore is a module too. Trivial changes to the relevant Kconfig and Makefile (and a few trivial changes elsewhere) allow usbmon to be built as a module even if usbcore is builtin. This is verified to work in all 9 permutations (3 correctly prohibited by Kconfig, 6 build a suitable result). Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:54 -08:00
Inaky Perez-Gonzalez	dc023dceec	USB: Introduce usb_queue_reset() to do resets from atomic contexts This patch introduces a new call to be able to do a USB reset from an atomic contect. This is quite helpful in USB callbacks to handle errors (when the only thing that can be done is to do a device reset). It is done queuing a work struct that will do the actual reset. The struct is "attached" to an interface so pending requests from an interface are removed when said interface is unbound from the driver. The call flow then becomes: usb_queue_reset_device() __usb_queue_reset_device() [workqueue] usb_reset_device() usb_probe_interface() usb_cancel_queue_reset() [error path] usb_unbind_interface() usb_cancel_queue_reset() usb_driver_release_interface() usb_cancel_queue_reset() Note usb_cancel_queue_reset() needs smarts to try not to unqueue when it is actually being executed. This happens when we run the reset from the workqueue: usb_reset_device() is called and on interface unbind time, usb_cancel_queue_reset() would be called. That would deadlock on cancel_work_sync(). To avoid that, we set (before running usb_reset_device()) usb_intf->reset_running and clear it inmediately after returning. Patch is against 2.6.28-rc2 and depends on http://marc.info/?l=linux-usb&m=122581634925308&w=2 (as submitted by Alan Stern). Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:53 -08:00
Alan Stern	9ac39f28b5	USB: add asynchronous autosuspend/autoresume support This patch (as1160b) adds support routines for asynchronous autosuspend and autoresume, with accompanying documentation updates. There already are several potential users of this interface, and others are likely to arise as autosuspend support becomes more widespread. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:53 -08:00
Harvey Harrison	d767d88875	USB: wusb: annotate association types withe proper endianness Also a trivial annotation in rh.c for: drivers/usb/wusbcore/rh.c:366:9: warning: incorrect type in assignment (different base types) drivers/usb/wusbcore/rh.c:366:9: expected unsigned short [unsigned] [short] [usertype] <noident> drivers/usb/wusbcore/rh.c:366:9: got restricted __le16 [usertype] <noident> drivers/usb/wusbcore/rh.c:367:9: warning: incorrect type in assignment (different base types) drivers/usb/wusbcore/rh.c:367:9: expected unsigned short [unsigned] [short] [usertype] <noident> drivers/usb/wusbcore/rh.c:367:9: got restricted __le16 [usertype] <noident> Association types annotation fixes piles of warnings similar to: drivers/usb/wusbcore/cbaf.c:238:30: warning: incorrect type in initializer (different base types) drivers/usb/wusbcore/cbaf.c:238:30: expected restricted __le16 [usertype] id drivers/usb/wusbcore/cbaf.c:238:30: got int drivers/usb/wusbcore/cbaf.c:238:30: warning: incorrect type in initializer (different base types) drivers/usb/wusbcore/cbaf.c:238:30: expected restricted __le16 [usertype] len drivers/usb/wusbcore/cbaf.c:238:30: got int Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: David Vrabel <david.vrabel@csr.com> Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:51 -08:00
Rodolfo Giometti	b92a78e582	usb host: Oxford OXU210HP HCD driver. This driver implements the support for Oxford OXU210HP USB high-speed host, no peripheral nor OTG. Signed-off-by: Rodolfo Giometti <giometti@linux.it> Cc: Kan Liu <kan.k.liu@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-07 09:59:50 -08:00
Arjan van de Ven	efaee19206	async: make the final inode deletion an asynchronous event this makes "rm -rf" on a (names cached) kernel tree go from 11.6 to 8.6 seconds on an ext3 filesystem Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-01-07 08:47:24 -08:00
Arjan van de Ven	22a9d64567	async: Asynchronous function calls to speed up kernel boot Right now, most of the kernel boot is strictly synchronous, such that various hardware delays are done sequentially. In order to make the kernel boot faster, this patch introduces infrastructure to allow doing some of the initialization steps asynchronously, which will hide significant portions of the hardware delays in practice. In order to not change device order and other similar observables, this patch does NOT do full parallel initialization. Rather, it operates more in the way an out of order CPU does; the work may be done out of order and asynchronous, but the observable effects (instruction retiring for the CPU) are still done in the original sequence. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-01-07 08:45:46 -08:00
Jean Delvare	b305271861	i2c: Drop I2C_CLASS_CAM_DIGITAL There are a number of drivers which set their i2c bus class to I2C_CLASS_CAM_DIGITAL, however no chip driver actually checks for this flag, so we might as well drop it now. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2009-01-07 14:29:17 +01:00
Jean Delvare	994a075f0f	i2c: Drop I2C_CLASS_CAM_ANALOG and I2C_CLASS_SOUND There are no users left of these two i2c probe class flags so we can drop the now. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2009-01-07 14:29:17 +01:00
Jean Delvare	e1995f65be	i2c: Drop I2C_CLASS_ALL I2C_CLASS_ALL is almost never what bus driver authors really want. These i2c classes are really only about which devices must be probed, not what devices can be present. As device drivers get converted to the new i2c device driver model, only a few device types will keep relying on probing. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Sonic Zhang <sonic.zhang@analog.com>	2009-01-07 14:29:16 +01:00
Haavard Skinnemoen	52435bfc66	Merge branches 'fixes', 'cleanups' and 'boards'	2009-01-07 11:05:42 +01:00
Linus Torvalds	ede6f5aea0	Fix up 64-bit byte swaps for most 32-bit architectures The __SWAB_64_THRU_32__ case of a 64-bit byte swap was depending on the no-longer-existant ___swab32() method (three underscores). We got rid of some of the worst indirection and complexity, and now it should just use the 32-bit swab function that was defined right above it. Reported-and-tested-by: Nicolas Pitre <nico@cam.org> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 21:17:57 -08:00
Harvey Harrison	637b180c23	byteorder: remove the now unused byteorder.h This implementation caused problems in userspace which can, and does define _both_ __LITTLE_ENDIAN and __BIG_ENDIAN. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 19:45:13 -08:00
Harvey Harrison	991c0e6d1a	byteorder: only use linux/swab.h The first step to make swab.h a regular header that will include an asm/swab.h with arch overrides. Avoid the gratuitous differences introduced in the new linux/swab.h by naming the ___constant_swabXX bits and __fswabXX bits exactly as found in the old implementation in byteorder/swab[b].h Use this new swab.h in byteorder/[big\|little]_endian.h and remove the two old swab headers. Although the inclusion of asm/byteorder.h looks strange in linux/swab.h, this will allow each arch to move the actual arch overrides for the swab bits in an asm file and then the includes can be cleaned up without requiring a flag day for all arches at once. Keep providing __fswabXX in case some userspace was using them directly, but the revised __swabXX should be used instead in any new code and will always do constant folding not dependent on the optimization level, which means the __constant versions can be phased out in-kernel. Arches that use the old-style arch macros will lose their optimized versions until they move to the new style, but at least they will still compile. Many arches have already moved and the patches to move the remaining arches are trivial. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 18:10:26 -08:00
Linus Torvalds	db30c70575	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (29 commits) Input: i8042 - add Dell Vostro 1510 to nomux list Input: gtco - use USB endpoint API Input: add support for Maple controller as a joystick Input: atkbd - broaden the Dell DMI signatures Input: HIL drivers - add MODULE_ALIAS() Input: map_to_7segment.h - convert to __inline__ for userspace Input: add support for enhanced rotary controller on pxa930 and pxa935 Input: add support for trackball on pxa930 and pxa935 Input: add da9034 touchscreen support Input: ads7846 - strict_strtoul takes unsigned long Input: make some variables and functions static Input: add tsc2007 based touchscreen driver Input: psmouse - add module parameters to control OLPC touchpad delays Input: i8042 - add Gigabyte M912 netbook to noloop exception table Input: atkbd - Samsung NC10 key repeat fix Input: atkbd - add keyboard quirk for HP Pavilion ZV6100 laptop Input: libps2 - handle 0xfc responses from devices Input: add support for Wacom W8001 penabled serial touchscreen Input: synaptics - report multi-taps only if supported by the device Input: add joystick driver for Walkera WK-0701 RC transmitter ...	2009-01-06 17:14:01 -08:00
Linus Torvalds	c861ea2cb2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3] Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]" SELinux: shrink sizeof av_inhert selinux_class_perm and context CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2] keys: fix sparse warning by adding __user annotation to cast smack: Add support for unlabeled network hosts and networks selinux: Deprecate and schedule the removal of the the compat_net functionality netlabel: Update kernel configuration API	2009-01-06 17:11:39 -08:00
Linus Torvalds	3610639d1f	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: splitout peek ahead functionality, fix hrtimer: fixup comments hrtimer: fix recursion deadlock by re-introducing the softirq hrtimer: simplify hotplug migration hrtimer: fix HOTPLUG_CPU=n compile warning hrtimer: splitout peek ahead functionality	2009-01-06 17:10:53 -08:00
Linus Torvalds	cfa97f993c	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix section mismatch sched: fix double kfree in failure path sched: clean up arch_reinit_sched_domains() sched: mark sched_create_sysfs_power_savings_entries() as __init getrusage: RUSAGE_THREAD should return ru_utime and ru_stime sched: fix sched_slice() sched_clock: prevent scd->clock from moving backwards, take #2 sched: sched.c declare variables before they get used	2009-01-06 17:10:33 -08:00
Linus Torvalds	7238eb4ca3	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: provide irq_to_desc() to non-genirq architectures too	2009-01-06 17:10:19 -08:00
Linus Torvalds	f94181da71	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: fix rcutorture bug rcu: eliminate synchronize_rcu_xxx macro rcu: make treercu safe for suspend and resume rcu: fix rcutree grace-period-latency bug on small systems futex: catch certain assymetric (get\|put)_futex_key calls futex: make futex_(get\|put)_key() calls symmetric locking, percpu counters: introduce separate lock classes swiotlb: clean up EXPORT_SYMBOL usage swiotlb: remove unnecessary declaration swiotlb: replace architecture-specific swiotlb.h with linux/swiotlb.h swiotlb: add support for systems with highmem swiotlb: store phys address in io_tlb_orig_addr array swiotlb: add hwdev to swiotlb_phys_to_bus() / swiotlb_sg_to_bus()	2009-01-06 17:10:04 -08:00
Linus Torvalds	40d7ee5d16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (60 commits) uio: make uio_info's name and version const UIO: Documentation for UIO ioport info handling UIO: Pass information about ioports to userspace (V2) UIO: uio_pdrv_genirq: allow custom irq_flags UIO: use pci_ioremap_bar() in drivers/uio arm: struct device - replace bus_id with dev_name(), dev_set_name() libata: struct device - replace bus_id with dev_name(), dev_set_name() avr: struct device - replace bus_id with dev_name(), dev_set_name() block: struct device - replace bus_id with dev_name(), dev_set_name() chris: struct device - replace bus_id with dev_name(), dev_set_name() dmi: struct device - replace bus_id with dev_name(), dev_set_name() gadget: struct device - replace bus_id with dev_name(), dev_set_name() gpio: struct device - replace bus_id with dev_name(), dev_set_name() gpu: struct device - replace bus_id with dev_name(), dev_set_name() hwmon: struct device - replace bus_id with dev_name(), dev_set_name() i2o: struct device - replace bus_id with dev_name(), dev_set_name() IA64: struct device - replace bus_id with dev_name(), dev_set_name() i7300_idle: struct device - replace bus_id with dev_name(), dev_set_name() infiniband: struct device - replace bus_id with dev_name(), dev_set_name() ISDN: struct device - replace bus_id with dev_name(), dev_set_name() ...	2009-01-06 17:02:07 -08:00
Linus Torvalds	5fec8bdbf9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: clean up annotations of fc->lock fuse: fix sparse warning in ioctl fuse: update interface version fuse: add fuse_conn->release() fuse: separate out fuse_conn_init() from new_conn() fuse: add fuse_ prefix to several functions fuse: implement poll support fuse: implement unsolicited notification fuse: add file kernel handle fuse: implement ioctl support fuse: don't let fuse_req->end() put the base reference fuse: move FUSE_MINOR to miscdevice.h fuse: style fixes	2009-01-06 17:01:20 -08:00
Linus Torvalds	59e3af21e9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (41 commits) scc_pata: make use of scc_dma_sff_read_status() ide-dma-sff: factor out ide_dma_sff_write_status() ide: move read_sff_dma_status() method to 'struct ide_dma_ops' ide: don't set hwif->dma_ops in init_dma() method Resurrect IT8172 IDE controller driver piix: sync ich_laptop[] with ata_piix.c ide: update warm-plug HOWTO ide: fix ide_port_scan() to do ACPI setup after initializing request queues ide: remove now redundant ->cur_dev checks ide: remove unused ide_hwif_t.sg_mapped field ide: struct ide_atapi_pc - remove unused fields and update documentation ide: remove superfluous hwif variable assignment from ide_timer_expiry() ide: use ide_pci_is_in_compatibility_mode() helper in setup-pci.c ide: make "paranoia" ->handler check in ide_intr() more strict ide-cd: convert to ide-atapi facilities ide-cd: start DMA before sending the actual packet command ide-cd: wait for DRQ to get set per default ide: Fix drive's DWORD-IO handling ide: add port and host iterators ide: dynamic allocation of device structures ...	2009-01-06 17:00:50 -08:00
WANG Cong	8cd3ac3aca	fs/exec.c: make do_coredump() void No one cares do_coredump()'s return value, and also it seems that it is also not necessary. So make it void. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: WANG Cong <wangcong@zeuux.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:29 -08:00
Randy Dunlap	d3635abfee	rapidio: remove excess kernel-doc notation Remove excess kernel-doc notation from rio header and driver: Warning(include/linux/rio_drv.h:399): Excess function parameter or struct member 'buffer' description in 'rio_get_inb_message' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:28 -08:00
David Brownell	cabb3fc4bd	twl4030-gpio: cleanup debounce Provide a static debounce configuration mechanism for twl4030 GPIOs, replacing the previous dynamic one. The single user of that mechanism was for MMC card detect debouncing. Boards can provide a bitmask saying which GPIOs to debounce (30 msec). It's always enabled for pins with the MMC card-detect/VMMCx link active, so most boards won't need to set the debounce mask. This is a net code shrink, including runtime footprint. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:25 -08:00
Ian Kent	a92daf6ba1	autofs4: make autofs type usage explicit - the type assigned at mount when no type is given is changed from 0 to AUTOFS_TYPE_INDIRECT. This was done because 0 and AUTOFS_TYPE_INDIRECT were being treated implicitly as the same type. - previously, an offset mount had it's type set to AUTOFS_TYPE_DIRECT\|AUTOFS_TYPE_OFFSET but the mount control re-implementation needs to be able distinguish all three types. So this was changed to make the type setting explicit. - a type AUTOFS_TYPE_ANY was added for use by the re-implementation when checking if a given path is a mountpoint. It's not really a type as we use this to ask if a given path is a mountpoint in the autofs_dev_ioctl_ismountpoint() function. - functions to set and test the autofs mount types have been added to improve readability and make the type usage explicit. - the mount type is used from user space for the mount control re-implementtion so, for consistency, all the definitions have been moved to the user space include file include/linux/auto_fs4.h. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:23 -08:00
Ian Kent	730c9eeca9	autofs4: improve parameter usage The parameter usage in the device node ioctl code uses arg1 and arg2 as parameter names. This patch redefines the parameter names to reflect what they actually are in an effort to make the code more readable. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:23 -08:00
Masami Hiramatsu	e8386a0cb2	kprobes: support probing module __exit function Allows kprobes to probe __exit routine. This adds flags member to struct kprobe. When module is freed(kprobes hooks module_notifier to get this event), kprobes which probe the functions in that module are set to "Gone" flag to the flags member. These "Gone" probes are never be enabled. Users can check the GONE flag through debugfs. This also removes mod_refcounted, because we couldn't free a module if kprobe incremented the refcount of that module. [akpm@linux-foundation.org: document some locking] [mhiramat@redhat.com: bugfix: pass aggr_kprobe to arch_remove_kprobe] [mhiramat@redhat.com: bugfix: release old_p's insn_slot before error return] Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:21 -08:00
Masami Hiramatsu	1294156078	kprobes: add kprobe_insn_mutex and cleanup arch_remove_kprobe() Add kprobe_insn_mutex for protecting kprobe_insn_pages hlist, and remove kprobe_mutex from architecture dependent code. This allows us to call arch_remove_kprobe() (and free_insn_slot) while holding kprobe_mutex. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:20 -08:00
Masami Hiramatsu	a06f6211ef	module: add within_module_core() and within_module_init() This series of patches allows kprobes to probe module's __init and __exit functions. This means, you can probe driver initialization and terminating. Currently, kprobes can't probe __init function because these functions are freed after module initialization. And it also can't probe module __exit functions because kprobe increments reference count of target module and user can't unload it. this means __exit functions never be called unless removing probes from the module. To solve both cases, this series of patches introduces GONE flag and sets it when the target code is freed(for this purpose, kprobes hooks MODULE_STATE_* events). This also removes refcount incrementing for allowing user to unload target module. Users can check which probes are GONE by debugfs interface. For taking timing of freeing module's .init text, these also include a patch which adds module's notifier of MODULE_STATE_LIVE event. This patch: Add within_module_core() and within_module_init() for checking whether an address is in the module .init.text section or .text section, and replace within() local inline functions in kernel/module.c with them. kprobes uses these functions to check where the kprobe is inserted. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:20 -08:00
David Brownell	d29389de0b	spi_gpio driver Generalize the old at91rm9200 "bootstrap" bitbanging SPI master driver as "spi_gpio", so it works with arbitrary GPIOs and can be configured through platform_data. Such SPI masters support: - any number of bus instances (bus_num is the platform_device.id) - any number of chipselects (one GPIO per spi_device) - all four SPI_MODE values, and SPI_CS_HIGH - i/o word sizes from 1 to 32 bits; - devices configured as with any other spi_master controller When configured using platform_data, this provides relatively low clock rates. On platforms that support inlined GPIO calls, significantly improved transfer speeds are also possible with a semi-custom driver. (It's still painful when accessing flash memory, but less so.) Sanity checked by using this version to replace both native controllers on a board with six different SPI slaves, relying on three different SPI_MODE_* values and both SPI_CS_HIGH settings for correct operation. [akpm@linux-foundation.org: cleanups] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Magnus Damm <damm@igel.co.jp> Tested-by: Magnus Damm <damm@igel.co.jp> Cc: Torgil Svensson <torgil.svensson@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:19 -08:00
Hiroshi Shimamoto	5cf0cc4e67	binfmts.h: include list.h linux_binfmt uses list_head, so list.h is needed. [akpm@linux-foundation.org: fix `make headerscheck'] Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:19 -08:00
Jesper Juhl	8c3659347e	include/linux/interrupt.h: do not include linux/irqnr.h twice Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:14 -08:00
Eric Dumazet	179f7ebff6	percpu_counter: FBC_BATCH should be a variable For NR_CPUS >= 16 values, FBC_BATCH is 2NR_CPUS Considering more and more distros are using high NR_CPUS values, it makes sense to use a more sensible value for FBC_BATCH, and get rid of NR_CPUS. A sensible value is 2num_online_cpus(), with a minimum value of 32 (This minimum value helps branch prediction in __percpu_counter_add()) We already have a hotcpu notifier, so we can adjust FBC_BATCH dynamically. We rename FBC_BATCH to percpu_counter_batch since its not a constant anymore. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:13 -08:00
Tejun Heo	5f820f648c	poll: allow f_op->poll to sleep f_op->poll is the only vfs operation which is not allowed to sleep. It's because poll and select implementation used task state to synchronize against wake ups, which doesn't have to be the case anymore as wait/wake interface can now use custom wake up functions. The non-sleep restriction can be a bit tricky because ->poll is not called from an atomic context and the result of accidentally sleeping in ->poll only shows up as temporary busy looping when the timing is right or rather wrong. This patch converts poll/select to use custom wake up function and use separate triggered variable to synchronize against wake up events. The only added overhead is an extra function call during wake up and negligible. This patch removes the one non-sleep exception from vfs locking rules and is beneficial to userland filesystem implementations like FUSE, 9p or peculiar fs like spufs as it's very difficult for those to implement non-sleeping poll method. While at it, make the following cosmetic changes to make poll.h and select.c checkpatch friendly. * s/type * symbol/type symbol/ : three places in poll.h remove blank line before EXPORT_SYMBOL() : two places in select.c Oleg: spotted missing barrier in poll_schedule_timeout() Davide: spotted missing write barrier in pollwake() Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Brad Boyer <flar@allandria.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Roland McGrath <roland@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:12 -08:00
Darrick J. Wong	9fe06081ef	Create a DIV_ROUND_CLOSEST macro to do division with rounding Create a helper macro to divide two numbers and round the result to the nearest whole number. This is a helper macro for hwmon drivers that want to convert incoming sysfs values per standard hwmon practice, though the macro itself can be used by anyone. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:12 -08:00
Alexey Dobriyan	f1883f86de	Remove remaining unwinder code Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Gabor Gombas <gombasg@sztaki.hu> Cc: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:11 -08:00
Matthew Wilcox	ea43546750	atomic_t: unify all arch definitions The atomic_t type cannot currently be used in some header files because it would create an include loop with asm/atomic.h. Move the type definition to linux/types.h to break the loop. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:10 -08:00
Oleg Nesterov	901608d904	mm: introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting xacct_add_tsk() relies on do_exit()->update_hiwater_xxx() and uses mm->hiwater_xxx directly, this leads to 2 problems: - taskstats_user_cmd() can call fill_pid()->xacct_add_tsk() at any moment before the task exits, so we should check the current values of rss/vm anyway. - do_exit()->update_hiwater_xxx() calls are racy. An exiting thread can be preempted right before mm->hiwater_xxx = new_val, and another thread can use A_LOT of memory and exit in between. When the first thread resumes it can be the last thread in the thread group, in that case we report the wrong hiwater_xxx values which do not take A_LOT into account. Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() helpers and change xacct_add_tsk() to use them. The first helper will also be used by rusage->ru_maxrss accounting. Kill do_exit()->update_hiwater_xxx() calls. Unless we are going to decrease rss/vm there is no point to update mm->hiwater_xxx, and nobody can look at this mm_struct when exit_mmap() actually unmaps the memory. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Hugh Dickins <hugh@veritas.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Nick Piggin	856bf4d717	fs: sys_sync fix s_syncing livelock avoidance was breaking data integrity guarantee of sys_sync, by allowing sys_sync to skip writing or waiting for superblocks if there is a concurrent sys_sync happening. This livelock avoidance is much less important now that we don't have the get_super_to_sync() call after every sb that we sync. This was replaced by __put_super_and_need_restart. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Nick Piggin	4f5a99d64c	fs: remove WB_SYNC_HOLD Remove WB_SYNC_HOLD. The primary motiviation is the design of my anti-starvation code for fsync. It requires taking an inode lock over the sync operation, so we could run into lock ordering problems with multiple inodes. It is possible to take a single global lock to solve the ordering problem, but then that would prevent a future nice implementation of "sync multiple inodes" based on lock order via inode address. Seems like a backward step to remove this, but actually it is busted anyway: we can't use the inode lists for data integrity wait: an inode can be taken off the dirty lists but still be under writeback. In order to satisfy data integrity semantics, we should wait for it to finish writeback, but if we only search the dirty lists, we'll miss it. It would be possible to have a "writeback" list, for sys_sync, I suppose. But why complicate things by prematurely optimise? For unmounting, we could avoid the "livelock avoidance" code, which would be easier, but again premature IMO. Fixing the existing data integrity problem will come next. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:09 -08:00
Hugh Dickins	edc315fd22	badpage: remove vma from page_remove_rmap Remove page_remove_rmap()'s vma arg, which was only for the Eeek message. And remove the BUG_ON(page_mapcount(page) == 0) from CONFIG_DEBUG_VM's page_dup_rmap(): we're trying to be more resilient about that than BUGs. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	2509ef26db	badpage: zap print_bad_pte on swap and file Complete zap_pte_range()'s coverage of bad pagetable entries by calling print_bad_pte() on a pte_file in a linear vma and on a bad swap entry. That needs free_swap_and_cache() to tell it, which will also have shown one of those "swap_free" errors (but with much less information). Similar checks in fork's copy_one_pte()? No, that would be more noisy than helpful: we'll see them when parent and child exec or exit. Where do_nonlinear_fault() calls print_bad_pte(): omit !VM_CAN_NONLINEAR case, that could only be a bug in sys_remap_file_pages(), not a bad pte. VM_FAULT_OOM rather than VM_FAULT_SIGBUS? Well, okay, that is consistent with what happens if do_swap_page() operates a bad swap entry; but don't we have patches to be more careful about killing when VM_FAULT_OOM? Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	79f4b7bf39	badpage: simplify page_alloc flag check+clear Simplify the PAGE_FLAGS checking and clearing when freeing and allocating a page: check the same flags as before when freeing, clear ALL the flags (unless PageReserved) when freeing, check ALL flags off when allocating. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:07 -08:00
Hugh Dickins	20137a490f	swapfile: swapon randomize if nonrot Swap allocation has always started from the beginning of the swap area; but if we're dealing with a solidstate swap device which can only remap blocks within limited zones, that would sooner wear out the first zone. Therefore sys_swapon() test whether blk_queue is non-rotational, and if so randomize the cluster_next starting position for allocation. If blk_queue is nonrot, note SWP_SOLIDSTATE for later use, and report it with an "SS" at the right end of the kernel's "Adding ... swap" message (so that if it's both nonrot and discardable, "SSD" will be shown there). Perhaps something should be shown in /proc/swaps (swapon -s), but we have to be more cautious before making any addition to that format. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	7992fde72c	swapfile: swap allocation use discard When scan_swap_map() finds a free cluster of swap pages to allocate, discard the old contents of the cluster if the device supports discard. But don't bother when swap is so fragmented that we allocate single pages. Be careful about racing allocations made while we're scanning for a cluster; and hold up allocations made while we're discarding. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	6a6ba83175	swapfile: swapon use discard (trim) When adding swap, all the old data on swap can be forgotten: sys_swapon() discard all but the header page of the swap partition (or every extent but the header of the swap file), to give a solidstate swap device the opportunity to optimize its wear-levelling. If that succeeds, note SWP_DISCARDABLE for later use, and report it with a "D" at the right end of the kernel's "Adding ... swap" message. Perhaps something should be shown in /proc/swaps (swapon -s), but we have to be more cautious before making any addition to that format. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Joern Engel <joern@logfs.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Donjun Shin <djshin90@gmail.com> Cc: Tejun Heo <teheo@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	ebebbbe904	swapfile: rearrange scan and swap_info Before making functional changes, rearrange scan_swap_map() to simplify subsequent diffs. Actually, there is one functional change in there: leave cluster_nr negative while scanning for a new cluster - resetting it early increased the likelihood that when we have difficulty finding a free cluster, another task may come in and try doing exactly the same - just a waste of cpu. Before making functional changes, rearrange struct swap_info_struct slightly: flags will be needed as an unsigned long (for wait_on_bit), next is a good int to pair with prio, old_block_size is uninteresting so shift it to the end. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	22c6f8fdb3	swapfile: remove SWP_ACTIVE mask Remove the SWP_ACTIVE mask: it just obscures the SWP_WRITEOK flag. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
KOSAKI Motohiro	69beeb1d34	mm: make vread() and vwrite() declaration Sparse output following warnings. mm/vmalloc.c:1436:6: warning: symbol 'vread' was not declared. Should it be static? mm/vmalloc.c:1474:6: warning: symbol 'vwrite' was not declared. Should it be static? However, it is used by /dev/kmem. fixed here. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:05 -08:00
Hugh Dickins	b962716b45	mm: optimize get_scan_ratio for no swap Rik suggests a simplified get_scan_ratio() for !CONFIG_SWAP. Yes, the gcc optimizer gives us that, when nr_swap_pages is #defined as 0L. Move usual declaration to swapfile.c: it never belonged in page_alloc.c. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	60371d971a	mm: add add_to_swap stub If we add a failing stub for add_to_swap(), then we can remove the #ifdef CONFIG_SWAP from mm/vmscan.c. This was intended as a source cleanup, but looking more closely, it turns out that the !CONFIG_SWAP case was going to keep_locked for an anonymous page, whereas now it goes to the more suitable activate_locked, like the CONFIG_SWAP nr_swap_pages 0 case. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	ac47b003d0	mm: remove gfp_mask from add_to_swap Remove gfp_mask argument from add_to_swap(): it's misleading because its only caller, shrink_page_list(), is not atomic at that point; and in due course (implementing discard) we'll sometimes want to allocate some memory with GFP_NOIO (as is used in swap_writepage) when allocating swap. No change to the gfp_mask passed down to add_to_swap_cache(): still use __GFP_HIGH without __GFP_WAIT (with nomemalloc and nowarn as before): though it's not obvious if that's the best combination to ask for here. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:04 -08:00
Hugh Dickins	a2c43eed83	mm: try_to_free_swap replaces remove_exclusive_swap_page remove_exclusive_swap_page(): its problem is in living up to its name. It doesn't matter if someone else has a reference to the page (raised page_count); it doesn't matter if the page is mapped into userspace (raised page_mapcount - though that hints it may be worth keeping the swap): all that matters is that there be no more references to the swap (and no writeback in progress). swapoff (try_to_unuse) has been removing pages from swapcache for years, with no concern for page count or page mapcount, and we used to have a comment in lookup_swap_cache() recognizing that: if you go for a page of swapcache, you'll get the right page, but it could have been removed from swapcache by the time you get page lock. So, give up asking for exclusivity: get rid of remove_exclusive_swap_page(), and remove_exclusive_swap_page_ref() and remove_exclusive_swap_page_count() which were spawned for the recent LRU work: replace them by the simpler try_to_free_swap() which just checks page_swapcount(). Similarly, remove the page_count limitation from free_swap_and_count(), but assume that it's worth holding on to the swap if page is mapped and swap nowhere near full. Add a vm_swap_full() test in free_swap_cache()? It would be consistent, but I think we probably have enough for now. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
Hugh Dickins	7b1fe59793	mm: reuse_swap_page replaces can_share_swap_page A good place to free up old swap is where do_wp_page(), or do_swap_page(), is about to redirty the page: the data on disk is then stale and won't be read again; and if we do decide to write the page out later, using the previous swap location makes an unnecessary disk seek very likely. So give can_share_swap_page() the side-effect of delete_from_swap_cache() when it safely can. And can_share_swap_page() was always a misleading name, the more so if it has a side-effect: rename it reuse_swap_page(). Irrelevant cleanup nearby: remove swap_token_default_timeout definition from swap.h: it's used nowhere. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
David Rientjes	2da02997e0	mm: add dirty_background_bytes and dirty_bytes sysctls This change introduces two new sysctls to /proc/sys/vm: dirty_background_bytes and dirty_bytes. dirty_background_bytes is the counterpart to dirty_background_ratio and dirty_bytes is the counterpart to dirty_ratio. With growing memory capacities of individual machines, it's no longer sufficient to specify dirty thresholds as a percentage of the amount of dirtyable memory over the entire system. dirty_background_bytes and dirty_bytes specify quantities of memory, in bytes, that represent the dirty limits for the entire system. If either of these values is set, its value represents the amount of dirty memory that is needed to commence either background or direct writeback. When a `bytes' or `ratio' file is written, its counterpart becomes a function of the written value. For example, if dirty_bytes is written to be 8096, 8K of memory is required to commence direct writeback. dirty_ratio is then functionally equivalent to 8K / the amount of dirtyable memory: dirtyable_memory = free pages + mapped pages + file cache dirty_background_bytes = dirty_background_ratio * dirtyable_memory -or- dirty_background_ratio = dirty_background_bytes / dirtyable_memory AND dirty_bytes = dirty_ratio * dirtyable_memory -or- dirty_ratio = dirty_bytes / dirtyable_memory Only one of dirty_background_bytes and dirty_background_ratio may be specified at a time, and only one of dirty_bytes and dirty_ratio may be specified. When one sysctl is written, the other appears as 0 when read. The `bytes' files operate on a page size granularity since dirty limits are compared with ZVC values, which are in page units. Prior to this change, the minimum dirty_ratio was 5 as implemented by get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user written value between 0 and 100. This restriction is maintained, but dirty_bytes has a lower limit of only one page. Also prior to this change, the dirty_background_ratio could not equal or exceed dirty_ratio. This restriction is maintained in addition to restricting dirty_background_bytes. If either background threshold equals or exceeds that of the dirty threshold, it is implicitly set to half the dirty threshold. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:03 -08:00
David Rientjes	364aeb2849	mm: change dirty limit type specifiers to unsigned long The background dirty and dirty limits are better defined with type specifiers of unsigned long since negative writeback thresholds are not possible. These values, as returned by get_dirty_limits(), are normally compared with ZVC values to determine whether writeback shall commence or be throttled. Such page counts cannot be negative, so declaring the page limits as signed is unnecessary. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	2afd1c928f	mm: make page_lock_anon_vma() static page_lock_anon_vma() and page_unlock_anon_vma() were made available to show_page_path() in vmscan.c; but now that has been removed, make them static in rmap.c again, they're better kept private if possible. Signed-off-by: Hugh Dickins <hugh@veritas.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	b5934c5318	mm: add_active_or_unevictable into rmap lru_cache_add_active_or_unevictable() and page_add_new_anon_rmap() always appear together. Save some symbol table space and some jumping around by removing lru_cache_add_active_or_unevictable(), folding its code into page_add_new_anon_rmap(): like how we add file pages to lru just after adding them to page cache. Remove the nearby "TODO: is this safe?" comments (yes, it is safe), and change page_add_new_anon_rmap()'s address BUG_ON to VM_BUG_ON as originally intended. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	6d91add09f	mm: add Set,ClearPageSwapCache stubs If we add NOOP stubs for SetPageSwapCache() and ClearPageSwapCache(), then we can remove the #ifdef CONFIG_SWAPs from mm/migrate.c. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:02 -08:00
Hugh Dickins	3c1d43787b	mm: remove GFP_HIGHUSER_PAGECACHE GFP_HIGHUSER_PAGECACHE is just an alias for GFP_HIGHUSER_MOVABLE, making that harder to track down: remove it, and its out-of-work brothers GFP_NOFS_PAGECACHE and GFP_USER_PAGECACHE. Since we're making that improvement to hotremove_migrate_alloc(), I think we can now also remove one of the "o"s from its comment. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:01 -08:00
Hugh Dickins	e5991371ee	mm: remove cgroup_mm_owner_callbacks cgroup_mm_owner_callbacks() was brought in to support the memrlimit controller, but sneaked into mainline ahead of it. That controller has now been shelved, and the mm_owner_changed() args were inadequate for it anyway (they needed an mm pointer instead of a task pointer). Remove the dead code, and restore mm_update_next_owner() locking to how it was before: taking mmap_sem there does nothing for memcontrol.c, now the only user of mm->owner. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:01 -08:00
KOSAKI Motohiro	64cdd548ff	mm: cleanup: remove #ifdef CONFIG_MIGRATION #ifdef in *.c file decrease source readability a bit. removing is better. This patch doesn't have any functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
KOSAKI Motohiro	1b0bd11886	mm: get rid of pagevec_release_nonlru() speculative page references patch (commit: `e286781d5f`) removed last pagevec_release_nonlru() caller. So this function can be removed now. This patch doesn't have any functional change. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
Gary Hade	c04fc586c1	mm: show node to memory section relationship with symlinks in sysfs Show node to memory section relationship with symlinks in sysfs Add /sys/devices/system/node/nodeX/memoryY symlinks for all the memory sections located on nodeX. For example: /sys/devices/system/node/node1/memory135 -> ../../memory/memory135 indicates that memory section 135 resides on node1. Also revises documentation to cover this change as well as updating Documentation/ABI/testing/sysfs-devices-memory to include descriptions of memory hotremove files 'phys_device', 'phys_index', and 'state' that were previously not described there. In addition to it always being a good policy to provide users with the maximum possible amount of physical location information for resources that can be hot-added and/or hot-removed, the following are some (but likely not all) of the user benefits provided by this change. Immediate: - Provides information needed to determine the specific node on which a defective DIMM is located. This will reduce system downtime when the node or defective DIMM is swapped out. - Prevents unintended onlining of a memory section that was previously offlined due to a defective DIMM. This could happen during node hot-add when the user or node hot-add assist script onlines _all_ offlined sections due to user or script inability to identify the specific memory sections located on the hot-added node. The consequences of reintroducing the defective memory could be ugly. - Provides information needed to vary the amount and distribution of memory on specific nodes for testing or debugging purposes. Future: - Will provide information needed to identify the memory sections that need to be offlined prior to physical removal of a specific node. Symlink creation during boot was tested on 2-node x86_64, 2-node ppc64, and 2-node ia64 systems. Symlink creation during physical memory hot-add tested on a 2-node x86_64 system. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
David Rientjes	75aa199410	oom: print triggering task's cpuset and mems allowed When cpusets are enabled, it's necessary to print the triggering task's set of allowable nodes so the subsequently printed meminfo can be interpreted correctly. We also print the task's cpuset name for informational purposes. [rientjes@google.com: task lock current before dereferencing cpuset] Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:59 -08:00
Nick Piggin	1c0fe6e3bd	mm: invoke oom-killer from page fault Rather than have the pagefault handler kill a process directly if it gets a VM_FAULT_OOM, have it call into the OOM killer. With increasingly sophisticated oom behaviour (cpusets, memory cgroups, oom killing throttling, oom priority adjustment or selective disabling, panic on oom, etc), it's silly to unconditionally kill the faulting process at page fault time. Create a hook for pagefault oom path to call into instead. Only converted x86 and uml so far. [akpm@linux-foundation.org: make __out_of_memory() static] [akpm@linux-foundation.org: fix comment] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jeff Dike <jdike@addtoit.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
Mel Gorman	3340289ddf	mm: report the MMU pagesize in /proc/pid/smaps The KernelPageSize entry in /proc/pid/smaps is the pagesize used by the kernel to back a VMA. This matches the size used by the MMU in the majority of cases. However, one counter-example occurs on PPC64 kernels whereby a kernel using 64K as a base pagesize may still use 4K pages for the MMU on older processor. To distinguish, this patch reports MMUPageSize as the pagesize used by the MMU in /proc/pid/smaps. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
Mel Gorman	08fba69986	mm: report the pagesize backing a VMA in /proc/pid/smaps It is useful to verify a hugepage-aware application is using the expected pagesizes for its memory regions. This patch creates an entry called KernelPageSize in /proc/pid/smaps that is the size of page used by the kernel to back a VMA. The entry is not called PageSize as it is possible the MMU uses a different size. This extension should not break any sensible parser that skips lines containing unrecognised information. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:58:58 -08:00
James Morris	ac8cc0fa53	Merge branch 'next' into for-linus	2009-01-07 09:58:22 +11:00
David Howells	3699c53c48	CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3 ] Fix a regression in cap_capable() due to: commit `3b11a1dece` Author: David Howells <dhowells@redhat.com> Date: Fri Nov 14 10:39:26 2008 +1100 CRED: Differentiate objective and effective subjective credentials on a task The problem is that the above patch allows a process to have two sets of credentials, and for the most part uses the subjective credentials when accessing current's creds. There is, however, one exception: cap_capable(), and thus capable(), uses the real/objective credentials of the target task, whether or not it is the current task. Ordinarily this doesn't matter, since usually the two cred pointers in current point to the same set of creds. However, sys_faccessat() makes use of this facility to override the credentials of the calling process to make its test, without affecting the creds as seen from other processes. One of the things sys_faccessat() does is to make an adjustment to the effective capabilities mask, which cap_capable(), as it stands, then ignores. The affected capability check is in generic_permission(): if (!(mask & MAY_EXEC) \|\| execute_ok(inode)) if (capable(CAP_DAC_OVERRIDE)) return 0; This change passes the set of credentials to be tested down into the commoncap and SELinux code. The security functions called by capable() and has_capability() select the appropriate set of credentials from the process being checked. This can be tested by compiling the following program from the XFS testsuite: /* * t_access_root.c - trivial test program to show permission bug. * * Written by Michael Kerrisk - copyright ownership not pursued. * Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html / #include <limits.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/stat.h> #define UID 500 #define GID 100 #define PERM 0 #define TESTPATH "/tmp/t_access" static void errExit(char msg) { perror(msg); exit(EXIT_FAILURE); } /* errExit / static void accessTest(char file, int mask, char mstr) { printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask)); } / accessTest / int main(int argc, char argv[]) { int fd, perm, uid, gid; char testpath; char cmd[PATH_MAX + 20]; testpath = (argc > 1) ? argv[1] : TESTPATH; perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM; uid = (argc > 3) ? atoi(argv[3]) : UID; gid = (argc > 4) ? atoi(argv[4]) : GID; unlink(testpath); fd = open(testpath, O_RDWR \| O_CREAT, 0); if (fd == -1) errExit("open"); if (fchown(fd, uid, gid) == -1) errExit("fchown"); if (fchmod(fd, perm) == -1) errExit("fchmod"); close(fd); snprintf(cmd, sizeof(cmd), "ls -l %s", testpath); system(cmd); if (seteuid(uid) == -1) errExit("seteuid"); accessTest(testpath, 0, "0"); accessTest(testpath, R_OK, "R_OK"); accessTest(testpath, W_OK, "W_OK"); accessTest(testpath, X_OK, "X_OK"); accessTest(testpath, R_OK \| W_OK, "R_OK \| W_OK"); accessTest(testpath, R_OK \| X_OK, "R_OK \| X_OK"); accessTest(testpath, W_OK \| X_OK, "W_OK \| X_OK"); accessTest(testpath, R_OK \| W_OK \| X_OK, "R_OK \| W_OK \| X_OK"); exit(EXIT_SUCCESS); } / main */ This can be run against an Ext3 filesystem as well as against an XFS filesystem. If successful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns 0 access(/tmp/xxx, W_OK) returns 0 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns 0 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 If unsuccessful, it will show: [root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043 ---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx access(/tmp/xxx, 0) returns 0 access(/tmp/xxx, R_OK) returns -1 access(/tmp/xxx, W_OK) returns -1 access(/tmp/xxx, X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK) returns -1 access(/tmp/xxx, R_OK \| X_OK) returns -1 access(/tmp/xxx, W_OK \| X_OK) returns -1 access(/tmp/xxx, R_OK \| W_OK \| X_OK) returns -1 I've also tested the fix with the SELinux and syscalls LTP testsuites. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-01-07 09:38:48 +11:00
James Morris	29881c4502	Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2 ]" This reverts commit `14eaddc967`. David has a better version to come.	2009-01-07 09:21:54 +11:00
Oliver Hartkopp	1fa17d4ba4	can: omit unneeded skb_clone() calls The AF_CAN core delivered always cloned sk_buffs to the AF_CAN protocols, although this was _only_ needed by the can-raw protocol. With this (additionally documented) change, the AF_CAN core calls the callback functions of the registered AF_CAN protocols with the original (uncloned) sk_buff pointer and let's the can-raw protocol do the skb_clone() itself which omits all unneeded skb_clone() calls for other AF_CAN protocols. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 11:07:54 -08:00
Herbert Xu	e1c096e251	vlan: Add GRO interfaces This patch adds GRO interfaces for hardware-assisted VLAN reception. With this in place we're now at parity with LRO as far as the interface is concerned. That is, you can now take any LRO driver and convert it over to GRO. As the CB memory clashes with GRO's use of CB, I've removed it entirely by storing dev in skb->dev. This is OK because VLAN gets called first thing in netif_receive_skb and skb->dev is not used in between us calling netif_rx and netif_receive_skb getting called. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 10:50:09 -08:00
Herbert Xu	96e93eab20	gro: Add internal interfaces for VLAN Previously GRO's only entry point from the outside is through napi_gro_receive and napi_gro_frags. These interfaces are for device drivers. This patch rearranges things to provide a new set of interfaces for VLANs. These interfaces are for internal use only. The VLAN code itself can then provide a set of entry points for device drivers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-06 10:49:34 -08:00
Stephen Rothwell	b8ac9fc0e8	uio: make uio_info's name and version const These are only ever assigned constant strings and never modified. This was noticed because Wolfram Sang needed to cast the result of of_get_property() in order to assign it to the name field of a struct uio_info. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:44 -08:00
Hans J. Koch	e70c412ee4	UIO: Pass information about ioports to userspace (V2) Devices sometimes have memory where all or parts of it can not be mapped to userspace. But it might still be possible to access this memory from userspace by other means. An example are PCI cards that advertise not only mappable memory but also ioport ranges. On x86 architectures, these can be accessed with ioperm, iopl, inb, outb, and friends. Mike Frysinger (CCed) reported a similar problem on Blackfin arch where it doesn't seem to be easy to mmap non-cached memory but it can still be accessed from userspace. This patch allows kernel drivers to pass information about such ports to userspace. Similar to the existing mem[] array, it adds a port[] array to struct uio_info. Each port range is described by start, size, and porttype. If a driver fills in at least one such port range, the UIO core will simply pass this information to userspace by creating a new directory "portio" underneath /sys/class/uio/uioN/. Similar to the "mem" directory, it will contain a subdirectory (portX) for each port range given. Note that UIO simply passes this information to userspace, it performs no action whatsoever with this data. It's userspace's responsibility to obtain access to these ports and to solve arch dependent issues. The "porttype" attribute tells userspace what kind of port it is dealing with. This mechanism could also be used to give userspace information about GPIOs related to a device. You frequently find such hardware in embedded devices, so I added a UIO_PORT_GPIO definition. I'm not really sure if this is a good idea since there are other solutions to this problem, but it won't hurt much anyway. Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:44 -08:00
Kay Sievers	475b44c199	mtd: struct device - replace bus_id with dev_name(), dev_set_name() CC: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:38 -08:00
Mark McLoughlin	0aa0dc41bf	driver core: add root_device_register() Add support for allocating root device objects which group device objects under /sys/devices directories. Also add a sysfs 'module' symlink which points to the owner of the root device object. This symlink will be used in virtio to allow userspace to determine which virtio bus implementation a given device is associated with. [Includes suggestions from Cornelia Huck] Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Cornelia Huck	d0d85ff989	Make DEBUG take precedence over DYNAMIC_PRINTK_DEBUG Statically defined DEBUG should take precedence over dynamically enabled debugging; otherwise adding DEBUG (like, for example, via CONFIG_DEBUG_KOBJECT) does not have the expected result of printing pr_debug() and dev_dbg() messages unconditionally. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Greg Kroah-Hartman	b9daa99ee5	driver core: move knode_bus into private structure Nothing outside of the driver core should ever touch knode_bus, so move it out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:33 -08:00
Greg Kroah-Hartman	93e746db18	driver core: move knode_driver into private structure Nothing outside of the driver core should ever touch knode_driver, so move it out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Greg Kroah-Hartman	11c3b5c3e0	driver core: move klist_children into private structure Nothing outside of the driver core should ever touch klist_children, or knode_parent, so move them out of the public eye. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Greg Kroah-Hartman	2831fe6f9c	driver core: create a private portion of struct device This is to be used to move things out of struct device that no code outside of the driver core should ever touch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:32 -08:00
Matthew Wilcox	210272a284	driver core: Remove completion from struct klist_node Removing the completion from klist_node reduces its size from 64 bytes to 28 on x86-64. To maintain the semantics of klist_remove(), we add a single list of klist nodes which are pending deletion and scan them. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:30 -08:00
Matthew Wilcox	929d2fa595	driver core: Rearrange struct device for better packing This minor rearrangement saves 16 bytes from sizeof(struct device) according to pahole. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:30 -08:00
Alan Stern	7f4f5d4516	Fix misspellings in pm.h macros This patch (as1167) fixes some misspellings in various recently-added macros in pm.h. Fortunately these macros are not yet used anywhere. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>	2009-01-06 10:44:30 -08:00
Rafael J. Wysocki	adf094931f	PM: Simplify the new suspend/hibernation framework for devices PM: Simplify the new suspend/hibernation framework for devices Following the discussion at the Kernel Summit, simplify the new device PM framework by merging 'struct pm_ops' and 'struct pm_ext_ops' and removing pointers to 'struct pm_ext_ops' from 'struct platform_driver' and 'struct pci_driver'. After this change, the suspend/hibernation callbacks will only reside in 'struct device_driver' as well as at the bus type/ device class/device type level. Accordingly, PCI and platform device drivers are now expected to put their suspend/hibernation callbacks into the 'struct device_driver' embedded in 'struct pci_driver' or 'struct platform_driver', respectively. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-01-06 10:44:29 -08:00
Dan Williams	864498aaa9	dmaengine: use idr for registering dma device numbers This brings some predictability to dma device numbers, i.e. an rmmod/insmod cycle may now result in /sys/class/dma/dma0chan0 being restored rather than /sys/class/dma/dma1chan0 appearing. Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:21 -07:00
Dan Williams	41d5e59c12	dmaengine: add a release for dma class devices and dependent infrastructure Resolves: WARNING: at drivers/base/core.c:122 device_release+0x4d/0x52() Device 'dma0chan0' does not have a release() function, it is broken and must be fixed. The dma_chan_dev object is introduced to gear-match sysfs kobject and dmaengine channel lifetimes. When a channel is removed access to the sysfs entries return -ENODEV until the kobject can be released. The bulk of the change is updates to existing code to handle the extra layer of indirection between a dma_chan and its struct device. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:21 -07:00
Dan Williams	7dd6025101	dmaengine: kill enum dma_state_client DMA_NAK is now useless. We can just use a bool instead. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:19 -07:00
Dan Williams	f27c580c36	dmaengine: remove 'bigref' infrastructure Reference counting is done at the module level so clients need not worry that a channel will leave while they are actively using dmaengine. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:18 -07:00
Dan Williams	aa1e6f1a38	dmaengine: kill struct dma_client and supporting infrastructure All users have been converted to either the general-purpose allocator, dma_find_channel, or dma_request_channel. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:17 -07:00
Dan Williams	209b84a88f	dmaengine: replace dma_async_client_register with dmaengine_get Now that clients no longer need to be notified of channel arrival dma_async_client_register can simply increment the dmaengine_ref_count. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:17 -07:00
Dan Williams	74465b4ff9	atmel-mci: convert to dma_request_channel and down-level dma_slave dma_request_channel provides an exclusive channel, so we no longer need to pass slave data through dmaengine. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:16 -07:00
Dan Williams	33df8ca068	dmatest: convert to dma_request_channel Replace the client registration infrastructure with a custom loop to poll for channels. Once dma_request_channel returns NULL stop asking for channels. A userspace side effect of this change if that loading the dmatest module before loading a dma driver will result in no channels being found, previously dmatest would get a callback. To facilitate testing in the built-in case dmatest_init is marked as a late_initcall. Another side effect is that channels under test can not be used for any other purpose. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	59b5ec2144	dmaengine: introduce dma_request_channel and private channels This interface is primarily for device-to-memory clients which need to search for dma channels with platform-specific characteristics. The prototype is: struct dma_chan dma_request_channel(dma_cap_mask_t mask, dma_filter_fn filter_fn, void filter_param); When the optional 'filter_fn' parameter is set to NULL dma_request_channel simply returns the first channel that satisfies the capability mask. Otherwise, when the mask parameter is insufficient for specifying the necessary channel, the filter_fn routine can be used to disposition the available channels in the system. The filter_fn routine is called once for each free channel in the system. Upon seeing a suitable channel filter_fn returns DMA_ACK which flags that channel to be the return value from dma_request_channel. A channel allocated via this interface is exclusive to the caller, until dma_release_channel() is called. To ensure that all channels are not consumed by the general-purpose allocator the DMA_PRIVATE capability is provided to exclude a dma_device from general-purpose (memory-to-memory) consideration. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	f67b459992	net_dma: convert to dma_find_channel Use the general-purpose channel allocation provided by dmaengine. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:15 -07:00
Dan Williams	2ba05622b8	dmaengine: provide a common 'issue_pending_all' implementation async_tx and net_dma each have open-coded versions of issue_pending_all, so provide a common routine in dmaengine. The implementation needs to walk the global device list, so implement rcu to allow dma_issue_pending_all to run lockless. Clients protect themselves from channel removal events by holding a dmaengine reference. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Dan Williams	bec085134e	dmaengine: centralize channel allocation, introduce dma_find_channel Allowing multiple clients to each define their own channel allocation scheme quickly leads to a pathological situation. For memory-to-memory offload all clients can share a central allocator. This simply moves the existing async_tx allocator to dmaengine with minimal fixups: * async_tx.c:get_chan_ref_by_cap --> dmaengine.c:nth_chan * async_tx.c:async_tx_rebalance --> dmaengine.c:dma_channel_rebalance * split out common code from async_tx.c:__async_tx_find_channel --> dma_find_channel Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Dan Williams	6f49a57aa5	dmaengine: up-level reference counting to the module level Simply, if a client wants any dmaengine channel then prevent all dmaengine modules from being removed. Once the clients are done re-enable module removal. Why?, beyond reducing complication: 1/ Tracking reference counts per-transaction in an efficient manner, as is currently done, requires a complicated scheme to avoid cache-line bouncing effects. 2/ Per-transaction ref-counting gives the false impression that a dma-driver can be gracefully removed ahead of its user (net, md, or dma-slave) 3/ None of the in-tree dma-drivers talk to hot pluggable hardware, but if such an engine were built one day we still would not need to notify clients of remove events. The driver can simply return NULL to a ->prep() request, something that is much easier for a client to handle. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-06 11:38:14 -07:00
Chuck Lever	57ef692588	NLM: Rewrite IPv4 privileged requester's check Clean up. For consistency, rewrite the IPv4 check to match the same style as the new IPv6 check. Note that ipv4_is_loopback() is somewhat broader in its interpretation of what is a loopback address than simply "127.0.0.1". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:56 -05:00
Chuck Lever	d1208f7073	NLM: nlm_privileged_requester() doesn't recognize mapped loopback address Commit `b85e4676` added the nlm_privileged_requester() helper to check whether an RPC request was sent from a local privileged caller. It recognizes IPv4 privileged callers (from "127.0.0.1"), and IPv6 privileged callers (from "::1"). However, IPV6_ADDR_LOOPBACK is not set for the mapped IPv4 loopback address (::ffff:7f00:0001), so the test breaks when the kernel's RPC service is IPv6-enabled but user space is calling via the IPv4 loopback address. This is actually the most common case for IPv6- enabled RPC services on Linux. Rewrite the IPv6 check to handle the mapped IPv4 loopback address as well as a normal IPv6 loopback address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:56 -05:00
Chuck Lever	8529bc51d3	NSM: Move nsm_addr() to fs/lockd/mon.c Clean up: nsm_addr_in() is no longer used, and nsm_addr() is used only in fs/lockd/mon.c, so move it there. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:55 -05:00
Chuck Lever	e6765b8397	NSM: Remove include/linux/lockd/sm_inter.h Clean up: The include/linux/lockd/sm_inter.h header is nearly empty now. Remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:55 -05:00
Chuck Lever	92fd91b998	NLM: Remove "create" argument from nsm_find() Clean up: nsm_find() now has only one caller, and that caller unconditionally sets the @create argument. Thus the @create argument is no longer needed. Since nsm_find() now has a more specific purpose, pick a more appropriate name for it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	3420a8c435	NSM: Add nsm_lookup() function Introduce a new API to fs/lockd/mon.c that allows nlm_host_rebooted() to lookup up nsm_handles via the contents of an nlm_reboot struct. The new function is equivalent to calling nsm_find() with @create set to zero, but it takes a struct nlm_reboot instead of separate arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	576df4634e	NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque The NLM XDR decoders for the NLMPROC_SM_NOTIFY procedure should treat their "priv" argument truly as an opaque, as defined by the protocol, and let the upper layers figure out what is in it. This will make it easier to modify the contents and interpretation of the "priv" argument, and keep knowledge about what's in "priv" local to fs/lockd/mon.c. For now, the NLM and NSM implementations should behave exactly as they did before. The formation of the address of the rebooted host in nlm_host_rebooted() may look a little strange, but it is the inverse of how nsm_init_private() forms the private cookie. Plus, it's going away soon anyway. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	7fefc9cb9d	NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument Pass the nlm_reboot data structure directly from the NLMPROC_SM_NOTIFY XDR decoders to nlm_host_rebooted(). This eliminates some packing and unpacking of the NLMPROC_SM_NOTIFY results, and prepares for passing these results, including the "priv" cookie, directly to a lookup routine in fs/lockd/mon.c. This patch changes code organization but should not cause any behavioral change. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:54 -05:00
Chuck Lever	7e44d3bea2	NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created Introduce a new data type, used by both the in-kernel NLM and NSM implementations, that is used to manage the opaque "priv" argument for the NSMPROC_MON and NLMPROC_SM_NOTIFY calls. Construct the "priv" cookie when the nsm_handle is created. The nsm_init_private() function may look a little strange, but it is roughly equivalent to how the XDR encoder formed the "priv" argument. It's going to go away soon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:53 -05:00
Chuck Lever	67c6d107a6	NSM: Move nsm_find() to fs/lockd/mon.c The nsm_find() function sets up fresh nsm_handle entries. This is where we will store the "priv" cookie used to lookup nsm_handles during reboot recovery. The cookie will be constructed when nsm_find() creates a new nsm_handle. As much as possible, I would like to keep everything that handles a "priv" cookie in fs/lockd/mon.c so that all the smarts are in one source file. That organization should make it pretty simple to see how all this works. To me, it makes more sense than the current arrangement to keep nsm_find() with nsm_monitor() and nsm_unmonitor(). So, start reorganizing by moving nsm_find() into fs/lockd/mon.c. The nsm_release() function comes along too, since it shares the nsm_lock global variable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:53 -05:00
Chuck Lever	36e8e668d3	NSM: Move NSM program and procedure numbers to fs/lockd/mon.c Clean up: Move the RPC program and procedure numbers for NSM into the one source file that needs them: fs/lockd/mon.c. And, as with NLM, NFS, and rpcbind calls, use NSMPROC_FOO instead of SM_FOO for NSM procedure numbers. Finally, make a couple of comments more precise: what is referred to here as SM_NOTIFY is really the NLM (lockd) NLMPROC_SM_NOTIFY downcall, not NSMPROC_NOTIFY. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	9c1bfd037f	NSM: Move NSM-related XDR data structures to lockd's xdr.h Clean up: NSM's XDR data structures are used only in fs/lockd/mon.c, so move them there. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	356c3eb466	NLM: Move the public declaration of nsm_unmonitor() to lockd.h Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h. Add a documenting comment. Bruce observed that nsm_unmonitor()'s only caller doesn't care about its return code, so make nsm_unmonitor() return void. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	c8c23c423d	NSM: Release nsmhandle in nlm_destroy_host The nsm_handle's reference count is bumped in nlm_lookup_host(). It should be decremented in nlm_destroy_host() to make it easier to see the balance of these two operations. Move the nsm_release() call to fs/lockd/host.c. The h_nsmhandle pointer is set in nlm_lookup_host(), and never cleared. The nlm_destroy_host() function is never called for the same nlm_host twice, so h_nsmhandle won't ever be NULL when nsm_unmonitor() is called. All references to the nlm_host are gone before it is freed. We can skip making h_nsmhandle NULL just before the nlm_host is deallocated. It's also likely we can remove the h_nsmhandle NULL check in nlmsvc_is_client() as well, but we can do that later when rearchitect- ing the nlm_host cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	1e49323c4a	NLM: Move the public declaration of nsm_monitor() to lockd.h Clean up. Make the nlm_host argument "const," and move the public declaration to lockd.h with other NSM public function (nsm_release, eg) and global variable declarations. Add a documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:52 -05:00
Chuck Lever	29ed1407ed	NSM: Support IPv6 version of mon_name The "mon_name" argument of the NSMPROC_MON and NSMPROC_UNMON upcalls is a string that contains the hostname or IP address of the remote peer to be notified when this host has rebooted. The sm-notify command uses this identifier to contact the peer when we reboot, so it must be either a well-qualified DNS hostname or a presentation format IP address string. When the "nsm_use_hostnames" sysctl is set to zero, the kernel's NSM provides a presentation format IP address in the "mon_name" argument. Otherwise, the "caller_name" argument from NLM requests is used, which is usually just the DNS hostname of the peer. To support IPv6 addresses for the mon_name argument, we use the nsm_handle's address eye-catcher, which already contains an appropriate presentation format address string. Using the eye-catcher string obviates the need to use a large buffer on the stack to form the presentation address string for the upcall. This patch also addresses a subtle bug. An NSMPROC_MON request and the subsequent NSMPROC_UNMON request for the same peer are required to use the same value for the "mon_name" argument. Otherwise, rpc.statd's NSMPROC_UNMON processing cannot locate the database entry for that peer and remove it. If the setting of nsm_use_hostnames is changed between the time the kernel sends an NSMPROC_MON request and the time it sends the NSMPROC_UNMON request for the same peer, the "mon_name" argument for these two requests may not be the same. This is because the value of "mon_name" is currently chosen at the moment the call is made based on the setting of nsm_use_hostnames To ensure both requests pass identical contents in the "mon_name" argument, we now select which string to use for the argument in the nsm_monitor() function. A pointer to this string is saved in the nsm_handle so it can be used for a subsequent NSMPROC_UNMON upcall. NB: There are other potential problems, such as how nlm_host_rebooted() might behave if nsm_use_hostnames were changed while hosts are still being monitored. This patch does not attempt to address those problems. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:51 -05:00
Chuck Lever	f47534f7f0	NSM: Use modern style for sm_name field in nsm_handle Clean up: I'm about to add another "char " field to the nsm_handle structure. The sm_name field uses an older style of declaring a "char " field. If I match that style for the new field, checkpatch.pl will complain. So, fix the sm_name field to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:50 -05:00
Chuck Lever	bc995801a0	NLM: Support IPv6 scope IDs in nlm_display_address() Scope ID support is needed since the kernel's NSM implementation is about to use these displayed addresses as a mon_name in some cases. When nsm_use_hostnames is zero, without scope ID support NSM will fail to handle peers that contact us via a link-local address. Link-local addresses do not work without an interface ID, which is stored in the sockaddr's sin6_scope_id field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:49 -05:00
Chuck Lever	1df40b609a	NLM: Remove address eye-catcher buffers from nlm_host The h_name field in struct nlm_host is a just copy of h_nsmhandle->sm_name. Likewise, the contents of the h_addrbuf field should be identical to the sm_addrbuf field. The h_srcaddrbuf field is used only in one place for debugging. We can live without this until we get %pI formatting for printk(). Currently these buffers are 48 bytes, but we need to support scope IDs in IPv6 presentation addresses, which means making the buffers even larger. Instead, let's find ways to eliminate them to save space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:49 -05:00
Chuck Lever	7538ce1eb6	NLM: Use modern style for pointer fields in nlm_host Clean up: I'm about to add another "char " field to the nlm_host structure. The h_name field, for example, uses an older style of declaring a "char " field. If I match that style for the new field, checkpatch.pl will complain. So, fix pointer fields to use the new style. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:48 -05:00
Jeff Layton	c9233eb7b0	sunrpc: add sv_maxconn field to svc_serv (try #3 ) svc_check_conn_limits() attempts to prevent denial of service attacks by having the service close old connections once it reaches a threshold. This threshold is based on the number of threads in the service: (serv->sv_nrthreads + 3) * 20 Once we reach this, we drop the oldest connections and a printk pops to warn the admin that they should increase the number of threads. Increasing the number of threads isn't an option however for services like lockd. We don't want to eliminate this check entirely for such services but we need some way to increase this limit. This patch adds a sv_maxconn field to the svc_serv struct. When it's set to 0, we use the current method to calculate the max number of connections. RPC services can then set this on an as-needed basis. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
J. Bruce Fields	548eaca46b	nfsd: document new filehandle fsid types Descriptions taken from mountd code (in nfs-utils/utils/mountd/cache.c). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
Sergei Shtylyov	592b531521	ide: move read_sff_dma_status() method to 'struct ide_dma_ops' Move apparently misplaced read_sff_dma_status() method from 'struct ide_tp_ops' to 'struct ide_dma_ops', renaming it to dma_sff_read_status() and making only required for SFF-8038i compatible IDE controller drivers (greatly cutting down the number of initializers) as its only user (outside ide-dma-sff.c and such drivers) appears to be ide_pci_check_simplex() which is only called for such controllers... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:21:02 +01:00
Shane McDonald	391ad1908a	Resurrect IT8172 IDE controller driver Support for the IT8172 IDE controller was removed from the kernel sometime after 2.6.18. Support for the only boards that used the IT8172 was removed from the kernel after 2.6.18, as they had never compiled since 2.6.0. However, there are a couple of platforms that use this chip: the PMC-Sierra Xiao Hu thin-client computer, which is no longer in production, and the Linksys NSS4000 Network Attached Storage box, which is based on the Xiao Hu board. I am attempting to add support for the Xiao Hu to the kernel, and this IT8172 IDE controller is the first bit of code in this effort. This patch resurrects the IT8172 IDE controller code. I began with the 2.6.18 version of the it8172.c file, and have moved it forward so that it works with the latest version of the kernel. I have run this driver on a PMC-Sierra Xiao Hu board with the 2.6.28 kernel, and I have had no problems with it in my configuration. The attached patch applies cleanly against 2.6.28. Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: alan@lxorguk.ukuu.org.uk [bart: s/HWIF(drive)/drive->hwif/] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:21:01 +01:00
Bartlomiej Zolnierkiewicz	94c96445f3	ide: remove unused ide_hwif_t.sg_mapped field Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:59 +01:00
Bartlomiej Zolnierkiewicz	906ef986a7	ide: struct ide_atapi_pc - remove unused fields and update documentation Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:59 +01:00
Borislav Petkov	d6251d4488	ide-cd: convert to ide-atapi facilities ... and remove no longer needed cdrom_start_packet_command and cdrom_transfer_packet_command. Tested lightly with ide-cd and ide-floppy. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:58 +01:00
Bartlomiej Zolnierkiewicz	2bd24a1cfc	ide: add port and host iterators Add ide_port_for_each_dev() / ide_host_for_each_port() iterators and update IDE code to use them. While at it: - s/unit/i/ variable in ide_port_wait_ready(), ide_probe_port(), ide_port_tune_devices(), ide_port_init_devices_data(), do_reset1(), ide_acpi_set_state() and scc_dma_end() - s/d/i/ variable in ide_proc_port_register_devices() There should be no functional changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:56 +01:00
Bartlomiej Zolnierkiewicz	5e7f3a4669	ide: dynamic allocation of device structures Allocate device structures dynamically instead of having them embedded in ide_hwif_t: * Remove needless zeroing of port structure from ide_init_port_data(). * Add ide_hwif_t.devices[MAX_DRIVES] (table of pointers to the devices). * Add ide_port_{alloc,free}_devices() helpers and use them respectively in ide_{host,free}_alloc(). * Convert all users of ->drives[] to use ->devices[] instead. While at it: * Use drive->dn for the slave device check in scc_pata.c. As a nice side-effect this patch cuts ~1kB (x86-32) from the resulting code size: text data bss dec hex filename 53963 1244 237 55444 d894 drivers/ide/ide-core.o.before 52981 1244 237 54462 d4be drivers/ide/ide-core.o.after Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:56 +01:00
Bartlomiej Zolnierkiewicz	627e05daa1	ide: remove ->error method from struct ide_driver * Remove (now superfluous) ->error method from struct ide_driver. * Unexport __ide_error() and make it static. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:54 +01:00
Bartlomiej Zolnierkiewicz	7f3c868ba7	ide: remove ide_driver_t typedef While at it: - s/struct ide_driver_s/struct ide_driver/ - use to_ide_driver() macro in ide-proc.c Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:53 +01:00
Bartlomiej Zolnierkiewicz	9892ec5497	ide: remove 'byte' typedef Just use u8 instead, also s/__u8/u8/ in ide-cd.h while at it. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:53 +01:00
Bartlomiej Zolnierkiewicz	c0ae502347	ide: remove ide_pci_enablebit_t typedef Remove needless parens while at it. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	54cc1428cf	ide: remove local_irq_set() macro Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	898ec223fe	ide: remove HWIF() macro Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:52 +01:00
Bartlomiej Zolnierkiewicz	b40d1b88f1	ide: move ide_init_port_data() and friends to ide-probe.c * Move IDE_DEFAULT_MAX_FAILURES to <linux/ide.h>. * Move ide_cfg_mtx, ide_hwif_to_major[], ide_port_init_devices_data(), ide_init_port_data(), ide_init_port_hw() and ide_unregister() to ide-probe.c from ide.c. * Make ide_unregister(), ide_init_port_data(), ide_init_port_hw() and ide_cfg_mtx static. While at it: * Remove stale ide_init_port_data() documentation and ide_lock extern. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:51 +01:00
Bartlomiej Zolnierkiewicz	b65fac32cf	ide: merge ide_hwgroup_t with ide_hwif_t (v2) * Merge ide_hwgroup_t with ide_hwif_t. * Cleanup init_irq() accordingly, then remove no longer needed ide_remove_port_from_hwgroup() and ide_ports[]. * Remove now unused HWGROUP() macro. While at it: * ide_dump_ata_error() fixups v2: * Fix ->quirk_list check in do_ide_request() (s/hwif->cur_dev/prev_port->cur_dev). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:50 +01:00
Bartlomiej Zolnierkiewicz	5b31f855f1	ide: use lock bitops for ports serialization (v2) * Add ->host_busy field to struct ide_host and use it's first bit together with lock bitops to provide new ports serialization method. * Convert core IDE code to use new ide_[un]lock_host() helpers. This removes the need for taking hwgroup->lock if host is already busy on serialized hosts and makes it possible to merge ide_hwgroup_t into ide_hwif_t (done in the later patch). * Remove no longer needed ide_hwgroup_t.busy and ide_[un]lock_hwgroup(). * Update do_ide_request() documentation. v2: * ide_release_lock() should be called inside IDE_HFLAG_SERIALIZE check. * Add ide_hwif_t.busy flag and ide_[un]lock_port() for serializing devices on a port. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:49 +01:00
Bartlomiej Zolnierkiewicz	efe0397eef	ide: remove hwgroup->hwif and {drive,hwif}->next * Add 'int port_count' field to ide_hwgroup_t to keep the track of the number of ports in the hwgroup. Then update init_irq() and ide_remove_port_from_hwgroup() to use it. * Remove no longer needed hwgroup->hwif, {drive,hwif}->next, ide_add_drive_to_hwgroup() and ide_remove_drive_from_hwgroup() (hwgroup->drive now only denotes the currently active device in the hwgroup). * Update locking documentation in <linux/ide.h>. While at it: * Rename ->drive field in ide_hwgroup_t to ->cur_dev. * Use __func__ in ide_timer_expiry(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:49 +01:00
Bartlomiej Zolnierkiewicz	ae86afaee6	ide: use per-port IRQ handlers Use hwif instead of hwgroup as {request,free}_irq()'s cookie, teach ide_intr() to return early for non-active serialized ports, modify unexpected_intr() accordingly and then use per-port IRQ handlers instead of per-hwgroup ones. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:48 +01:00
Bartlomiej Zolnierkiewicz	bd53cbcce5	ide: add ->cur_port to struct ide_host and use it for serialized hosts * Pass 'ide_hwif_t ' instead of 'ide_hwgroup_t ' to unexpected_intr(). * Cache pointer to the port currently being serviced in ->cur_port and use it instead of hwif->hwgroup on serialized hosts. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-01-06 17:20:48 +01:00
Frederik Schwarzer	0211a9c850	trivial: fix an -> a typos in documentation and comments It is always "an" if there is a vowel _spoken_ (not written). So it is: "an hour" (spoken vowel) but "a uniform" (spoken 'j') Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-06 11:28:07 +01:00
Frederik Schwarzer	025dfdafe7	trivial: fix then -> than typos in comments and documentation - (better, more, bigger ...) then -> (...) than Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-01-06 11:28:06 +01:00
Ingo Molnar	d9be28ea91	Merge branches 'sched/clock', 'sched/cleanups' and 'linus' into sched/urgent	2009-01-06 09:33:57 +01:00
Ingo Molnar	fdbc0450df	Merge branches 'core/futexes', 'core/locking', 'core/rcu' and 'linus' into core/urgent	2009-01-06 09:32:11 +01:00
Rusty Russell	835481d9bc	cpumask: convert struct cpufreq_policy to cpumask_var_t Impact: use new cpumask API to reduce memory usage This is part of an effort to reduce structure sizes for machines configured with large NR_CPUS. cpumask_t gets replaced by cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or struct cpumask * (large NR_CPUS). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 09:05:31 +01:00
Theodore Ts'o	b3881f74b3	ext4: Add mount option to set kjournald's I/O priority Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jens Axboe <jens.axboe@oracle.com>	2009-01-05 22:46:26 -05:00
Linus Torvalds	238c6d5483	Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm snapshot: extend exception store functions dm snapshot: split out exception store implementations dm snapshot: rename struct exception_store dm snapshot: separate out exception store interface dm mpath: move trigger_event to system workqueue dm: add name and uuid to sysfs dm table: rework reference counting dm: support barriers on simple devices dm request: extend target interface dm request: add caches dm ioctl: allow dm_copy_name_and_uuid to return only one field dm log: ensure log bitmap fits on log device dm log: move region_size validation dm log: avoid reinitialising io_req on every operation dm: consolidate target deregistration error handling dm raid1: fix error count dm log: fix dm_io_client leak on error paths dm snapshot: change yield to msleep dm table: drop reference at unbind	2009-01-05 19:20:59 -08:00
Andi Kleen	ab4c142488	dm: support barriers on simple devices Implement barrier support for single device DM devices This patch implements barrier support in DM for the common case of dm linear just remapping a single underlying device. In this case we can safely pass the barrier through because there can be no reordering between devices. NB. Any DM device might cease to support barriers if it gets reconfigured so code must continue to allow for a possible -EOPNOTSUPP on every barrier bio submitted. - agk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:05:09 +00:00
Kiyoshi Ueda	7d76345da6	dm request: extend target interface This patch adds the following target interfaces for request-based dm. map_rq : for mapping a request rq_end_io : for finishing a request busy : for avoiding performance regression from bio-based dm. Target can tell dm core not to map requests now, and that may help requests in the block layer queue to be bigger by I/O merging. In bio-based dm, this behavior is done by device drivers managing the block layer queue. But in request-based dm, dm core has to do that since dm core manages the block layer queue. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:05:07 +00:00
Mikulas Patocka	10d3bd09a3	dm: consolidate target deregistration error handling Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of programming bug in the target driver. It can't fail because of user's behavior or disk errors. This patch changes unregister_target to return void and use BUG if someone tries to unregister non-registered target or unregister target that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-01-06 03:04:58 +00:00
Linus Torvalds	8e128ce331	Merge branch 'for-next' of git://git.o-hand.com/linux-mfd * 'for-next' of git://git.o-hand.com/linux-mfd: (30 commits) mfd: Fix section mismatch in da903x mfd: move drivers/i2c/chips/menelaus.c to drivers/mfd mfd: move drivers/i2c/chips/tps65010.c to drivers/mfd mfd: dm355evm msp430 driver mfd: Add missing break from wm3850-core mfd: Add WM8351 support mfd: Support configurable numbers of DCDCs and ISINKs on WM8350 mfd: Handle missing WM8350 platform data mfd: Add WM8352 support mfd: Use irq_to_desc in twl4030 code power_supply: Add Dialog DA9030 battery charger driver mfd: Dialog DA9030 battery charger MFD driver mfd: Register WM8400 codec device mfd: Pass driver_data onto child devices mfd: Fix twl4030-core.c build error mfd: twl4030 regulator bug fixes mfd: twl4030: create some regulator devices mfd: twl4030: cleanup symbols and OMAP dependency mfd: twl4030: simplified child creation code power_supply: Add battery health reporting for WM8350 ...	2009-01-05 19:04:09 -08:00
Linus Torvalds	0bbb275358	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: module: convert to stop_machine_create/destroy. stop_machine: introduce stop_machine_create/destroy. parisc: fix module loading failure of large kernel modules module: fix module loading failure of large kernel modules for parisc module: fix warning of unused function when !CONFIG_PROC_FS kernel/module.c: compare symbol values when marking symbols as exported in /proc/kallsyms. remove CONFIG_KMOD	2009-01-05 19:03:39 -08:00
Linus Torvalds	0578c3b4d4	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: swiotlb: Don't include linux/swiotlb.h twice in lib/swiotlb.c intel-iommu: fix build error with INTR_REMAP=y and DMAR=n swiotlb: add missing __init annotations	2009-01-05 19:03:11 -08:00
Linus Torvalds	8606ab6d30	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (22 commits) HID: fix error condition propagation in hid-sony driver HID: fix reference count leak hidraw HID: add proper support for pensketch 12x9 tablet HID: don't allow DealExtreme usb-radio be handled by usb hid driver HID: fix default Kconfig setting for TopSpeed driver HID: driver for TopSeed Cyberlink quirky remote HID: make boot protocol drivers depend on EMBEDDED HID: avoid sparse warning in HID_COMPAT_LOAD_DRIVER HID: hiddev cleanup -- handle all error conditions properly HID: force feedback driver for GreenAsia 0x12 PID HID: switch specialized drivers from "default y" to !EMBEDDED HID: set proper dev.parent in hidraw HID: add dynids facility HID: use GFP_KERNEL in hid_alloc_buffers HID: usbhid, use usb_endpoint_xfer_int HID: move usbhid flags to usbhid.h HID: add n-trig digitizer support HID: add phys and name ioctls to hidraw HID: struct device - replace bus_id with dev_name(), dev_set_name() HID: automatically call usbhid_set_leds in usbhid driver ...	2009-01-05 18:53:34 -08:00
Linus Torvalds	c54febae99	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (27 commits) GFS2: Use DEFINE_SPINLOCK GFS2: Fix use-after-free bug on umount (try #2) Revert "GFS2: Fix use-after-free bug on umount" GFS2: Streamline alloc calculations for writes GFS2: Send useful information with uevent messages GFS2: Fix use-after-free bug on umount GFS2: Remove ancient, unused code GFS2: Move four functions from super.c GFS2: Fix bug in gfs2_lock_fs_check_clean() GFS2: Send some sensible sysfs stuff GFS2: Kill two daemons with one patch GFS2: Move gfs2_recoverd into recovery.c GFS2: Fix "truncate in progress" hang GFS2: Clean up & move gfs2_quotad GFS2: Add more detail to debugfs glock dumps GFS2: Banish struct gfs2_rgrpd_host GFS2: Move rg_free from gfs2_rgrpd_host to gfs2_rgrpd GFS2: Move rg_igeneration into struct gfs2_rgrpd GFS2: Banish struct gfs2_dinode_host GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize ...	2009-01-05 18:52:54 -08:00
Linus Torvalds	15b0669072	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (44 commits) qlge: Fix sparse warnings for tx ring indexes. qlge: Fix sparse warning regarding rx buffer queues. qlge: Fix sparse endian warning in ql_hw_csum_setup(). qlge: Fix sparse endian warning for inbound packet control block flags. qlge: Fix sparse warnings for byte swapping in qlge_ethool.c myri10ge: print MAC and serial number on probe failure pkt_sched: cls_u32: Fix locking in u32_change() iucv: fix cpu hotplug af_iucv: Free iucv path/socket in path_pending callback af_iucv: avoid left over IUCV connections from failing connects af_iucv: New error return codes for connect() net/ehea: bitops work on unsigned longs Revert "net: Fix for initial link state in 2.6.28" tcp: Kill extraneous SPLICE_F_NONBLOCK checks. tcp: don't mask EOF and socket errors on nonblocking splice receive dccp: Integrate the TFRC library with DCCP dccp: Clean up ccid.c after integration of CCID plugins dccp: Lockless integration of CCID congestion-control plugins qeth: get rid of extra argument after printk to dev_* conversion qeth: No large send using EDDP for HiperSockets. ...	2009-01-05 18:44:59 -08:00
Linus Torvalds	e9af797d75	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix on resume, now preserves user policy min/max. [CPUFREQ] Add Celeron Core support to p4-clockmod. [CPUFREQ] add to speedstep-lib additional fsb values for core processors [CPUFREQ] Disable sysfs ui for p4-clockmod. [CPUFREQ] p4-clockmod: reduce noise [CPUFREQ] clean up speedstep-centrino and reduce cpumask_t usage	2009-01-05 18:33:38 -08:00
Linus Torvalds	10cc04f5a0	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (138 commits) ocfs2: Access the right buffer_head in ocfs2_merge_rec_left. ocfs2: use min_t in ocfs2_quota_read() ocfs2: remove unneeded lvb casts ocfs2: Add xattr support checking in init_security ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle ocfs2: calculate and reserve credits for xattr value in mknod ocfs2/xattr: fix credits calculation during index create ocfs2/xattr: Always updating ctime during xattr set. ocfs2/xattr: Remove extend_trans call and add its credits from the beginning ocfs2/dlm: Fix race during lockres mastery ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler() ocfs2/dlm: Fix a race between migrate request and exit domain ocfs2: One more hamming code optimization. ocfs2: Another hamming code optimization. ocfs2: Don't hand-code xor in ocfs2_hamming_encode(). ocfs2: Enable metadata checksums. ocfs2: Validate superblock with checksum and ecc. ocfs2: Checksum and ECC for directory blocks. ...	2009-01-05 18:32:43 -08:00
Linus Torvalds	520c853466	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: inotify: fix type errors in interfaces fix breakage in reiserfs_new_inode() fix the treatment of jfs special inodes vfs: remove duplicate code in get_fs_type() add a vfs_fsync helper sys_execve and sys_uselib do not call into fsnotify zero i_uid/i_gid on inode allocation inode->i_op is never NULL ntfs: don't NULL i_op isofs check for NULL ->i_op in root directory is dead code affs: do not zero ->i_op kill suid bit only for regular files vfs: lseek(fd, 0, SEEK_CUR) race condition	2009-01-05 18:32:06 -08:00
Nick Piggin	e8c82c2e23	mm lockless pagecache barrier fix An XFS workload showed up a bug in the lockless pagecache patch. Basically it would go into an "infinite" loop, although it would sometimes be able to break out of the loop! The reason is a missing compiler barrier in the "increment reference count unless it was zero" case of the lockless pagecache protocol in the gang lookup functions. This would cause the compiler to use a cached value of struct page pointer to retry the operation with, rather than reload it. So the page might have been removed from pagecache and freed (refcount==0) but the lookup would not correctly notice the page is no longer in pagecache, and keep attempting to increment the refcount and failing, until the page gets reallocated for something else. This isn't a data corruption because the condition will be detected if the page has been reallocated. However it can result in a lockup. Linus points out that ACCESS_ONCE is also required in that pointer load, even if it's absence is not causing a bug on our particular build. The most general way to solve this is just to put an rcu_dereference in radix_tree_deref_slot. Assembly of find_get_pages, before: .L220: movq (%rbx), %rax #* ivtmp.1162, tmp82 movq (%rax), %rdi #, prephitmp.1149 .L218: testb $1, %dil #, prephitmp.1149 jne .L217 #, testq %rdi, %rdi # prephitmp.1149 je .L203 #, cmpq $-1, %rdi #, prephitmp.1149 je .L217 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L218 #, after: .L212: movq (%rbx), %rax #* ivtmp.1109, tmp81 movq (%rax), %rdi #, ret testb $1, %dil #, ret jne .L211 #, testq %rdi, %rdi # ret je .L197 #, cmpq $-1, %rdi #, ret je .L211 #, movl 8(%rdi), %esi # <variable>._count.counter, c testl %esi, %esi # c je .L212 #, (notice the obvious infinite loop in the first example, if page->count remains 0) Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-05 18:31:12 -08:00
Dan Williams	07f2211e4f	dmaengine: remove dependency on async_tx async_tx.ko is a consumer of dma channels. A circular dependency arises if modules in drivers/dma rely on common code in async_tx.ko. It prevents either module from being unloaded. Move dma_wait_for_async_tx and async_tx_run_dependencies to dmaeninge.o where they should have been from the beginning. Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-01-05 18:10:19 -07:00
Michael Kerrisk	4ae8978cf9	inotify: fix type errors in interfaces The problems lie in the types used for some inotify interfaces, both at the kernel level and at the glibc level. This mail addresses the kernel problem. I will follow up with some suggestions for glibc changes. For the sys_inotify_rm_watch() interface, the type of the 'wd' argument is currently 'u32', it should be '__s32' . That is Robert's suggestion, and is consistent with the other declarations of watch descriptors in the kernel source, in particular, the inotify_event structure in include/linux/inotify.h: struct inotify_event { __s32 wd; /* watch descriptor / __u32 mask; / watch mask / __u32 cookie; / cookie to synchronize two events / __u32 len; / length (including nulls) of name / char name[0]; / stub for possible name */ }; The patch makes the changes needed for inotify_rm_watch(). Signed-off-by: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Robert Love <rlove@google.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:29 -05:00
Christoph Hellwig	4c728ef583	add a vfs_fsync helper Fsync currently has a fdatawrite/fdatawait pair around the method call, and a mutex_lock/unlock of the inode mutex. All callers of fsync have to duplicate this, but we have a few and most of them don't quite get it right. This patch adds a new vfs_fsync that takes care of this. It's a little more complicated as usual as ->fsync might get a NULL file pointer and just a dentry from nfsd, but otherwise gets afile and we want to take the mapping and file operations from it when it is there. Notes on the fsync callers: - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the lower file - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host file, and returning 0 when ->fsync was missing - shm wasn't calling either filemap_fdatawrite / filemap_fdatawait nor taking i_mutex. Now given that shared memory doesn't have disk backing not doing anything in fsync seems fine and I left it out of the vfs_fsync conversion for now, but in that case we might just not pass it through to the lower file at all but just call the no-op simple_sync_file directly. [and now actually export vfs_fsync] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:28 -05:00
Joel Becker	e06c8227fd	jbd2: Add buffer triggers Filesystems often to do compute intensive operation on some metadata. If this operation is repeated many times, it can be very expensive. It would be much nicer if the operation could be performed once before a buffer goes to disk. This adds triggers to jbd2 buffer heads. Just before writing a metadata buffer to the journal, jbd2 will optionally call a commit trigger associated with the buffer. If the journal is aborted, an abort trigger will be called on any dirty buffers as they are dropped from pending transactions. ocfs2 will use this feature. Initially I tried to come up with a more generic trigger that could be used for non-buffer-related events like transaction completion. It doesn't tie nicely, because the information a buffer trigger needs (specific to a journal_head) isn't the same as what a transaction trigger needs (specific to a tranaction_t or perhaps journal_t). So I implemented a buffer set, with the understanding that journal/transaction wide triggers should be implemented separately. There is only one trigger set allowed per buffer. I can't think of any reason to attach more than one set. Contrast this with a journal or transaction in which multiple places may want to watch the entire transaction separately. The trigger sets are considered static allocation from the jbd2 perspective. ocfs2 will just have one trigger set per block type, setting the same set on every bh of the same type. Signed-off-by: Joel Becker <joel.becker@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:30 -08:00
Jan Kara	7d9056ba20	quota: Export dquot_alloc() and dquot_destroy() functions These are default functions for creating and destroying quota structures and they should be used from filesystems. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:25 -08:00
Jan Kara	5cd9d5bb86	quota: Unexport dqblk_v1.h and dqblk_v2.h Unexport header files dqblk_v[12].h since except for quota format ID they don't contain information userspace should be interested in. Move ID definitions to quota.h. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:25 -08:00
Mark Fasheh	e97fcd95a4	jbd2: Add BH_JBDPrivateStart Add this so that file systems using JBD2 can safely allocate unused b_state bits. In this case, we add it so that Ocfs2 can define a single bit for tracking the validation state of a buffer. Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:24 -08:00

... 2 3 4 5 6 ...

14178 commits