From 12c78ca2ab5e64b636ce085fb08f7685654a5f22 Mon Sep 17 00:00:00 2001 From: Carlos Garnacho Date: Wed, 10 Aug 2016 17:24:15 +0200 Subject: [PATCH 1/4] ACPI / battery: Add sysfs representation after checking _BST Thus move sysfs_add_battery() after acpi_battery_get_state(), which doesn't require the power_supply. Prevents possible hanged tasks if acpi_battery_get_state() fails consistently (and takes a long time in doing so) when called inside acpi_battery_add(). In this situation the battery module first calls sysfs_add_battery(), which creates a power_supply, which spawns an async power_supply_deferred_register_work() task, which shall try to hold the parent battery device mutex (being already held) so this register work is set up after device initialization. If initialization takes long enough the thread will be eventually run and try to hold the mutex before acpi_battery_add() had the chance to finish. Eventually the 5 retries in acpi_battery_update_retry() fail, the error state is propagated, and results in sysfs_remove_battery() being called within the error handling paths of acpi_battery_add(), and the power_supply tear down too. This triggers a cancel_delayed_work_sync() of the deferred_register_work task, which ends up in schedule(). The end result is that the deferred task is blocked trying to acquire the parent device mutex, which is not released because the thread doing initialization (and failure handling) went to sleep awaiting for the deferred task to be cancelled. The hanged tasks look like this: INFO: task kworker/u8:0:6 blocked for more than 120 seconds. ... Call Trace: [] schedule+0x35/0x80 [] schedule_timeout+0x1ec/0x250 [] ? check_preempt_curr+0x52/0x90 [] ? ttwu_do_wakeup+0x19/0xe0 [] wait_for_common+0xc5/0x190 [] ? wake_up_q+0x70/0x70 [] wait_for_completion+0x1d/0x20 [] flush_work+0x111/0x1c0 [] ? flush_workqueue_prep_pwqs+0x1a0/0x1a0 [] __cancel_work_timer+0x9f/0x1d0 [] cancel_delayed_work_sync+0x13/0x20 [] power_supply_unregister+0x37/0xc0 [] sysfs_remove_battery+0x3d/0x52 [battery] [] acpi_battery_add+0x112/0x181 [battery] [] acpi_device_probe+0x54/0x19b [] driver_probe_device+0x22c/0x440 [] __driver_attach+0xd1/0xf0 [] ? driver_probe_device+0x440/0x440 [] bus_for_each_dev+0x6c/0xc0 [] driver_attach+0x1e/0x20 [] bus_add_driver+0x1c3/0x280 [] driver_register+0x60/0xe0 [] acpi_bus_register_driver+0x3b/0x43 [] acpi_battery_init_async+0x1c/0x1e [battery] [] async_run_entry_fn+0x48/0x150 [] process_one_work+0x1e9/0x440 [] worker_thread+0x4b/0x4f0 [] ? process_one_work+0x440/0x440 [] kthread+0xd8/0xf0 [] ret_from_fork+0x1f/0x40 [] ? kthread_worker_fn+0x180/0x180 INFO: task kworker/u8:4:282 blocked for more than 120 seconds. ... Call Trace: [] ? put_prev_entity+0x35/0x8b0 [] schedule+0x35/0x80 [] schedule_preempt_disabled+0xe/0x10 [] __mutex_lock_slowpath+0xb3/0x120 [] mutex_lock+0x1f/0x30 [] power_supply_deferred_register_work+0x2b/0x50 [] process_one_work+0x1e9/0x440 [] worker_thread+0x4b/0x4f0 [] ? process_one_work+0x440/0x440 [] ? process_one_work+0x440/0x440 [] kthread+0xd8/0xf0 [] ret_from_fork+0x1f/0x40 [] ? kthread_worker_fn+0x180/0x180 Making sysfs_add_battery() the last operation here means that the power_supply won't be created yet when the acpi_add_battery() failure handling happens, the deferred task won't even spawn, and sysfs_remove_battery will just skip over the NULL battery->bat. Signed-off-by: Carlos Garnacho Signed-off-by: Rafael J. Wysocki --- drivers/acpi/battery.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c index ab234791a0ba..93ecae55fe6a 100644 --- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -733,15 +733,17 @@ static int acpi_battery_update(struct acpi_battery *battery, bool resume) return result; acpi_battery_init_alarm(battery); } + + result = acpi_battery_get_state(battery); + if (result) + return result; + acpi_battery_quirks(battery); + if (!battery->bat) { result = sysfs_add_battery(battery); if (result) return result; } - result = acpi_battery_get_state(battery); - if (result) - return result; - acpi_battery_quirks(battery); /* * Wakeup the system if battery is critical low From dfa46c50f65b6ca10cea389327a6f1f1749bc633 Mon Sep 17 00:00:00 2001 From: Lv Zheng Date: Wed, 17 Aug 2016 16:22:58 +0800 Subject: [PATCH 2/4] ACPI / button: Fix an issue in button.lid_init_state=ignore mode On most platforms, _LID returning value, lid open/close events are all reliable, but there are exceptions. Some AML tables report wrong initial lid state [1], and some of them never report lid open state [2]. The usage model on such buggy platforms is: 1. The initial lid state returned from _LID is not reliable; 2. The lid open event is not reliable; 3. The lid close event is always reliable, used by the platform firmware to trigger OSPM power saving operations. This usage model is not compliant to the Linux SW_LID model as the Linux userspace is very strict to the reliability of the open events. In order not to trigger issues on such buggy platforms, the ACPI button driver currently implements a lid_init_state=open quirk to send additional "open" event after resuming. However, this is still not sufficient because: 1. Some special usage models (e.x., the dark resume scenario) cannot be supported by this mode. 2. If a "close" event is not used to trigger "suspend", then the subsequent "close" events cannot be seen by the userspace. So we need to stop sending the additional "open" event and switch the driver to lid_init_state=ignore mode and make sure the platform triggered events can be reliably delivered to the userspace. The userspace programs then can be changed to not to be strict to the "open" events on such buggy platforms. Why will the subsequent "close" events be lost? This is because the input layer automatically filters redundant events for switch events. Thus given that the buggy AML tables do not guarantee paired "open"/"close" events, the ACPI button driver currently is not able to guarantee that the platform triggered reliable events can be always be seen by the userspace via SW_LID. This patch adds a mechanism to insert lid events as a compensation for the platform triggered ones to form a complete event switches in order to make sure that the platform triggered events can always be reliably delivered to the userspace. This essentially guarantees that the platform triggered reliable "close" events will always be relibly delivered to the userspace. However this mechanism is not suitable for lid_init_state=open/method as it should not send the complement switch event for the unreliable initial lid state notification. 2 unreliable events can trigger unexpected behavior. Thus this patch only implements this mechanism for lid_init_state=ignore. Known issues: 1. Possible alternative approach This approach is based on the fact that Linux requires a switch event type for LID events. Another approach is to use key event type to implement ACPI lid events. With SW event type, since ACPI button driver inserts wrong lid events, there could be a potential issue that an "open" event issued from some AML update methods could result in a wrong "close" event to be delivered to the userspace. While using KEY event type, there is no such problem. However there may not be such a kind of real case, and if there is such a case, it is worked around in this patch as the complement switch event is only generated for "close" event in order to deliver the reliable "close" event to the userspace. Link: https://bugzilla.kernel.org/show_bug.cgi?id=89211 # [1] Link: https://bugzilla.kernel.org/show_bug.cgi?id=106151 # [1] Link: https://bugzilla.kernel.org/show_bug.cgi?id=106941 # [2] Signed-off-by: Lv Zheng Suggested-by: Dmitry Torokhov Signed-off-by: Rafael J. Wysocki --- drivers/acpi/button.c | 85 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 82 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c index 31abb0bdd4f2..e19f530f1083 100644 --- a/drivers/acpi/button.c +++ b/drivers/acpi/button.c @@ -19,6 +19,8 @@ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ +#define pr_fmt(fmt) "ACPI : button: " fmt + #include #include #include @@ -104,6 +106,8 @@ struct acpi_button { struct input_dev *input; char phys[32]; /* for input device */ unsigned long pushed; + int last_state; + ktime_t last_time; bool suspended; }; @@ -111,6 +115,10 @@ static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier); static struct acpi_device *lid_device; static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD; +static unsigned long lid_report_interval __read_mostly = 500; +module_param(lid_report_interval, ulong, 0644); +MODULE_PARM_DESC(lid_report_interval, "Interval (ms) between lid key events"); + /* -------------------------------------------------------------------------- FS Interface (/proc) -------------------------------------------------------------------------- */ @@ -134,10 +142,79 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state) { struct acpi_button *button = acpi_driver_data(device); int ret; + ktime_t next_report; + bool do_update; - /* input layer checks if event is redundant */ - input_report_switch(button->input, SW_LID, !state); - input_sync(button->input); + /* + * In lid_init_state=ignore mode, if user opens/closes lid + * frequently with "open" missing, and "last_time" is also updated + * frequently, "close" cannot be delivered to the userspace. + * So "last_time" is only updated after a timeout or an actual + * switch. + */ + if (lid_init_state != ACPI_BUTTON_LID_INIT_IGNORE || + button->last_state != !!state) + do_update = true; + else + do_update = false; + + next_report = ktime_add(button->last_time, + ms_to_ktime(lid_report_interval)); + if (button->last_state == !!state && + ktime_after(ktime_get(), next_report)) { + /* Complain the buggy firmware */ + pr_warn_once("The lid device is not compliant to SW_LID.\n"); + + /* + * Send the unreliable complement switch event: + * + * On most platforms, the lid device is reliable. However + * there are exceptions: + * 1. Platforms returning initial lid state as "close" by + * default after booting/resuming: + * https://bugzilla.kernel.org/show_bug.cgi?id=89211 + * https://bugzilla.kernel.org/show_bug.cgi?id=106151 + * 2. Platforms never reporting "open" events: + * https://bugzilla.kernel.org/show_bug.cgi?id=106941 + * On these buggy platforms, the usage model of the ACPI + * lid device actually is: + * 1. The initial returning value of _LID may not be + * reliable. + * 2. The open event may not be reliable. + * 3. The close event is reliable. + * + * But SW_LID is typed as input switch event, the input + * layer checks if the event is redundant. Hence if the + * state is not switched, the userspace cannot see this + * platform triggered reliable event. By inserting a + * complement switch event, it then is guaranteed that the + * platform triggered reliable one can always be seen by + * the userspace. + */ + if (lid_init_state == ACPI_BUTTON_LID_INIT_IGNORE) { + do_update = true; + /* + * Do generate complement switch event for "close" + * as "close" is reliable and wrong "open" won't + * trigger unexpected behaviors. + * Do not generate complement switch event for + * "open" as "open" is not reliable and wrong + * "close" will trigger unexpected behaviors. + */ + if (!state) { + input_report_switch(button->input, + SW_LID, state); + input_sync(button->input); + } + } + } + /* Send the platform triggered reliable event */ + if (do_update) { + input_report_switch(button->input, SW_LID, !state); + input_sync(button->input); + button->last_state = !!state; + button->last_time = ktime_get(); + } if (state) pm_wakeup_event(&device->dev, 0); @@ -411,6 +488,8 @@ static int acpi_button_add(struct acpi_device *device) strcpy(name, ACPI_BUTTON_DEVICE_NAME_LID); sprintf(class, "%s/%s", ACPI_BUTTON_CLASS, ACPI_BUTTON_SUBCLASS_LID); + button->last_state = !!acpi_lid_evaluate_state(device); + button->last_time = ktime_get(); } else { printk(KERN_ERR PREFIX "Unsupported hid [%s]\n", hid); error = -ENODEV; From f9151fcfcdfc662d4bfd74abf60c5e42722acf40 Mon Sep 17 00:00:00 2001 From: Lv Zheng Date: Wed, 17 Aug 2016 16:23:14 +0800 Subject: [PATCH 3/4] ACPI / button: Add document for ACPI control method lid device restrictions This patch adds documentation for the usage model of the control method lid device on some buggy platforms. Signed-off-by: Lv Zheng Acked-by: Benjamin Tissoires Signed-off-by: Rafael J. Wysocki --- Documentation/acpi/acpi-lid.txt | 96 +++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100644 Documentation/acpi/acpi-lid.txt diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt new file mode 100644 index 000000000000..effe7af3a5af --- /dev/null +++ b/Documentation/acpi/acpi-lid.txt @@ -0,0 +1,96 @@ +Special Usage Model of the ACPI Control Method Lid Device + +Copyright (C) 2016, Intel Corporation +Author: Lv Zheng + + +Abstract: + +Platforms containing lids convey lid state (open/close) to OSPMs using a +control method lid device. To implement this, the AML tables issue +Notify(lid_device, 0x80) to notify the OSPMs whenever the lid state has +changed. The _LID control method for the lid device must be implemented to +report the "current" state of the lid as either "opened" or "closed". + +For most platforms, both the _LID method and the lid notifications are +reliable. However, there are exceptions. In order to work with these +exceptional buggy platforms, special restrictions and expections should be +taken into account. This document describes the restrictions and the +expections of the Linux ACPI lid device driver. + + +1. Restrictions of the returning value of the _LID control method + +The _LID control method is described to return the "current" lid state. +However the word of "current" has ambiguity, some buggy AML tables return +the lid state upon the last lid notification instead of returning the lid +state upon the last _LID evaluation. There won't be difference when the +_LID control method is evaluated during the runtime, the problem is its +initial returning value. When the AML tables implement this control method +with cached value, the initial returning value is likely not reliable. +There are platforms always retun "closed" as initial lid state. + +2. Restrictions of the lid state change notifications + +There are buggy AML tables never notifying when the lid device state is +changed to "opened". Thus the "opened" notification is not guaranteed. But +it is guaranteed that the AML tables always notify "closed" when the lid +state is changed to "closed". The "closed" notification is normally used to +trigger some system power saving operations on Windows. Since it is fully +tested, it is reliable from all AML tables. + +3. Expections for the userspace users of the ACPI lid device driver + +The ACPI button driver exports the lid state to the userspace via the +following file: + /proc/acpi/button/lid/LID0/state +This file actually calls the _LID control method described above. And given +the previous explanation, it is not reliable enough on some platforms. So +it is advised for the userspace program to not to solely rely on this file +to determine the actual lid state. + +The ACPI button driver emits the following input event to the userspace: + SW_LID +The ACPI lid device driver is implemented to try to deliver the platform +triggered events to the userspace. However, given the fact that the buggy +firmware cannot make sure "opened"/"closed" events are paired, the ACPI +button driver uses the following 3 modes in order not to trigger issues. + +If the userspace hasn't been prepared to ignore the unreliable "opened" +events and the unreliable initial state notification, Linux users can use +the following kernel parameters to handle the possible issues: +A. button.lid_init_state=method: + When this option is specified, the ACPI button driver reports the + initial lid state using the returning value of the _LID control method + and whether the "opened"/"closed" events are paired fully relies on the + firmware implementation. + This option can be used to fix some platforms where the returning value + of the _LID control method is reliable but the initial lid state + notification is missing. + This option is the default behavior during the period the userspace + isn't ready to handle the buggy AML tables. +B. button.lid_init_state=open: + When this option is specified, the ACPI button driver always reports the + initial lid state as "opened" and whether the "opened"/"closed" events + are paired fully relies on the firmware implementation. + This may fix some platforms where the returning value of the _LID + control method is not reliable and the initial lid state notification is + missing. + +If the userspace has been prepared to ignore the unreliable "opened" events +and the unreliable initial state notification, Linux users should always +use the following kernel parameter: +C. button.lid_init_state=ignore: + When this option is specified, the ACPI button driver never reports the + initial lid state and there is a compensation mechanism implemented to + ensure that the reliable "closed" notifications can always be delievered + to the userspace by always pairing "closed" input events with complement + "opened" input events. But there is still no guarantee that the "opened" + notifications can be delivered to the userspace when the lid is actually + opens given that some AML tables do not send "opened" notifications + reliably. + In this mode, if everything is correctly implemented by the platform + firmware, the old userspace programs should still work. Otherwise, the + new userspace programs are required to work with the ACPI button driver. + This option will be the default behavior after the userspace is ready to + handle the buggy AML tables. From 060d791f760bd80c0f96de668a662cab0286146a Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Fri, 23 Sep 2016 17:57:06 +0300 Subject: [PATCH 4/4] ACPI / documentation: Use recommended name in GPIO property names The recommended property name for all kinds of GPIOs is to end it with "-gpios" even if there is only one GPIO. Update the documentation to follow this fact. Signed-off-by: Mika Westerberg Signed-off-by: Rafael J. Wysocki --- Documentation/acpi/gpio-properties.txt | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/Documentation/acpi/gpio-properties.txt b/Documentation/acpi/gpio-properties.txt index f35dad11f0de..5aafe0b351a1 100644 --- a/Documentation/acpi/gpio-properties.txt +++ b/Documentation/acpi/gpio-properties.txt @@ -28,8 +28,8 @@ index, like the ASL example below shows: ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"), Package () { - Package () {"reset-gpio", Package() {^BTH, 1, 1, 0 }}, - Package () {"shutdown-gpio", Package() {^BTH, 0, 0, 0 }}, + Package () {"reset-gpios", Package() {^BTH, 1, 1, 0 }}, + Package () {"shutdown-gpios", Package() {^BTH, 0, 0, 0 }}, } }) } @@ -48,7 +48,7 @@ Since ACPI GpioIo() resource does not have a field saying whether it is active low or high, the "active_low" argument can be used here. Setting it to 1 marks the GPIO as active low. -In our Bluetooth example the "reset-gpio" refers to the second GpioIo() +In our Bluetooth example the "reset-gpios" refers to the second GpioIo() resource, second pin in that resource with the GPIO number of 31. ACPI GPIO Mappings Provided by Drivers @@ -83,8 +83,8 @@ static const struct acpi_gpio_params reset_gpio = { 1, 1, false }; static const struct acpi_gpio_params shutdown_gpio = { 0, 0, false }; static const struct acpi_gpio_mapping bluetooth_acpi_gpios[] = { - { "reset-gpio", &reset_gpio, 1 }, - { "shutdown-gpio", &shutdown_gpio, 1 }, + { "reset-gpios", &reset_gpio, 1 }, + { "shutdown-gpios", &shutdown_gpio, 1 }, { }, };