remarkable-linux/fs/ocfs2/dlm
Changwei Ge e250a19937 ocfs2: fix cluster hang after a node dies
commit 1c01967116 upstream.

When a node dies, other live nodes have to choose a new master for an
existed lock resource mastered by the dead node.

As for ocfs2/dlm implementation, this is done by function -
dlm_move_lockres_to_recovery_list which marks those lock rsources as
DLM_LOCK_RES_RECOVERING and manages them via a list from which DLM
changes lock resource's master later.

So without invoking dlm_move_lockres_to_recovery_list, no master will be
choosed after dlm recovery accomplishment since no lock resource can be
found through ::resource list.

What's worse is that if DLM_LOCK_RES_RECOVERING is not marked for lock
resources mastered a dead node, it will break up synchronization among
nodes.

So invoke dlm_move_lockres_to_recovery_list again.

Fixs: 'commit ee8f7fcbe6 ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down")'
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED6E0F9@H3CMLB14-EX.srv.huawei-3com.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reported-by: Vitaly Mayatskih <v.mayatskih@gmail.com>
Tested-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:37:04 +01:00
..
dlmapi.h ocfs2/trivial: Remove trailing whitespaces 2010-01-25 19:20:51 -08:00
dlmast.c o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper 2015-02-10 14:30:30 -08:00
dlmcommon.h ocfs2/dlm: continue to purge recovery lockres when recovery master goes down 2016-08-02 17:31:41 -04:00
dlmconvert.c ocfs2/dlm: fix race between convert and migration 2016-09-19 15:36:16 -07:00
dlmconvert.h
dlmdebug.c locking/atomic, kref: Add kref_read() 2017-01-14 11:37:18 +01:00
dlmdebug.h ocfs2/dlm: fix memory leak of dlm_debug_ctxt 2016-07-26 16:19:19 -07:00
dlmdomain.c sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
dlmdomain.h ocfs2: dlm: dlmdomain: remove unused function 2015-02-10 14:30:29 -08:00
dlmlock.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmmaster.c scripts/spelling.txt: add "unneded" pattern and fix typo instances 2017-02-27 18:43:47 -08:00
dlmrecovery.c ocfs2: fix cluster hang after a node dies 2017-11-24 08:37:04 +01:00
dlmthread.c ocfs2/dlm: continue to purge recovery lockres when recovery master goes down 2016-08-02 17:31:41 -04:00
dlmunlock.c locking/atomic, kref: Add kref_read() 2017-01-14 11:37:18 +01:00
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00