alistair23-linux

redonkable

Author	SHA1	Message	Date
Andreas Gruenbacher	77c556f663	drbd: Add struct drbd_resource In a first step, each resource has exactly one connection, and both objects are allocated at the same time. The final result will be one resource and zero or more connections. Only allow to delete a resource if all its connections are C_STANDALONE. Stop the worker threads of all connections early enough. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:44:53 +01:00
Andreas Gruenbacher	05a10ec790	drbd: Improve some function and variable naming Rename functions conn_destroy() -> drbd_destroy_connection(), drbd_minor_destroy() -> drbd_destroy_device() drbd_adm_add_minor() -> drbd_adm_add_minor() drbd_adm_delete_minor() -> drbd_adm_del_minor() Rename global variable minors to drbd_devices Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:44:52 +01:00
Andreas Gruenbacher	a6b32bc3ce	drbd: Introduce "peer_device" object between "device" and "connection" In a setup where a device (aka volume) can replicate to multiple peers and one connection can be shared between multiple devices, we need separate objects to represent devices on peer nodes and network connections. As a first step to introduce multiple connections per device, give each drbd_device object a single drbd_peer_device object which connects it to a drbd_connection object. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:44:51 +01:00
Andreas Gruenbacher	bde89a9e15	drbd: Rename drbd_tconn -> drbd_connection sed -i -e 's:all_tconn:connections:g' -e 's:tconn:connection:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:44:47 +01:00
Andreas Gruenbacher	b30ab7913b	drbd: Rename "mdev" to "device" sed -i -e 's:mdev:device:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:42:24 +01:00
Andreas Gruenbacher	5476169793	drbd: Rename struct drbd_conf -> struct drbd_device sed -i -e 's:\<drbd_conf\>:drbd_device:g' Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:36:44 +01:00
Andreas Gruenbacher	a3603a6e3b	drbd: Split off on-the-wire protocol definitions Keep the protocol definitions separate from the kernel code; they are useful in their own right. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:27:49 +01:00
Philipp Reisner	8e22943430	drbd: Add missing error goto Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2014-02-17 16:24:47 +01:00
Rashika Kheria	4b7a530f6b	drivers: block: Mark functions as static in drbd_nl.c Mark functions conn_khelper(), nla_put_drbd_cfg_context(), nla_put_status_info() and get_one_status() as static in drbd/drbd_nl.c because they are not used outside this file. This eliminates the following warnings in drbd/drbd_nl.c: drivers/block/drbd/drbd_nl.c:365:5: warning: no previous prototype for ‘conn_khelper’ [-Wmissing-prototypes] drivers/block/drbd/drbd_nl.c:2727:5: warning: no previous prototype for ‘nla_put_drbd_cfg_context’ [-Wmissing-prototypes] drivers/block/drbd/drbd_nl.c:2753:5: warning: no previous prototype for ‘nla_put_status_info’ [-Wmissing-prototypes] drivers/block/drbd/drbd_nl.c:2895:5: warning: no previous prototype for ‘get_one_status’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2014-02-17 16:19:38 +01:00
Lars Ellenberg	35f47ef1a1	drbd: avoid to shrink max_bio_size due to peer re-configuration For a long time, the receiving side has spread "too large" incoming requests over multiple bios. No need to shrink our max_bio_size (max_hw_sectors) if the peer is reconfigured to use a different storage. The problem manifests itself if we are not the top of the device stack (DRBD is used as an LVM PV). A hardware reconfiguration on the peer may cause the supported max_bio_size to shrink, and the connection handshake would now unnecessarily shrink the max_bio_size on the active node. There is no way to notify upper layers that they have to "re-stack" their limits. So they won't notice at all, and may keep submitting bios that are suddenly considered "too large for device". We already check for compatibility and ignore changes on the peer, the code only was masked out unless we have a fully established connection. We just need to allow it a bit earlier during the handshake. Also consider max_hw_sectors in our merge bvec function, just in case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:29 -07:00
Philipp Reisner	57737adc96	drbd: Fix adding of new minors with freshly created meta data Online adding of new minors with freshly created meta data to a resource with an established connection failed, with a wrong state transition on one side of the new minor. Freshly created meta-data has a la_size (last agreed size) of 0. When we online add such devices, the code wrongly got into the code path for resyncing new storage that was added while the disk was detached. Fixed that by making the GREW from ZERO a special case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:28 -07:00
Philipp Reisner	d752b26960	drbd: Allow online change of al-stripes and al-stripe-size Allow to change the AL layout with a resize operation. For that the resize command gets two new fields: al_stripes and al_stripe_size. In order to make the operation crash safe: 1) Lock out all IO and MD-IO 2) Write the super block with MDF_PRIMARY_IND clear 3) write the bitmap to the new location (all zeros, since we allow only while connected) 4) Initialize the new AL-area 5) Write the super block with the restored MDF_PRIMARY_IND. 6) Unfreeze all IO Since the AL-layout has no influence on the protocol, this operation needs to be performed on both sides of a resource (if intended). Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Philipp Reisner	e96c96333f	drbd: Constants should be UPPERCASE Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Philipp Reisner	28e448bb30	drbd: Ignore the exit code of a fence-peer handler if it returns too late In case the connection was established and lost again before a fence-peer handler returns, ignore the exit code of this instance. (And use the exit code of the later started instance) Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Andreas Gruenbacher	f9eb7bf424	drbd: Fix rcu_read_lock balance on error path Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-06-28 16:04:36 +02:00
Lars Ellenberg	a3f8f7dc7a	drbd: validate resync_after dependency on attach already We validated resync_after dependencies, if changed via disk-options. But we did not validate them when first created via attach. We also did not check or cleanup dependencies that used to be correct, but now point to meanwhile removed minor devices. If the drbd_resync_after_valid() validation in disk-options tried to follow a dependency chain in this way, this could lead to NULL pointer dereference. Validate resync_after settings in drbd_adm_attach() already, as well as in drbd_adm_disk_opts(), and only reject dependency loops. Depending on non-existing disks is allowed and equivalent to no dependency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:25 -06:00
Philipp Reisner	2bd5ed5d67	drbd: Fix disconnect to keep the peer disk state if connection breaks during operation The issue was that if the connection broke while we did the graceful state change to C_DISCONNECTING (C_TEARDOWN), then we returned a success code from the state engine. (SS_CW_NO_NEED) The result of that is that we missed to call the fence-peer script in such a case. Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN). This one should never reach back into user space. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:25 -06:00
Philipp Reisner	0b2dafcd9f	drbd: drop now useless duplicate state request from invalidate Patch best viewed with git diff --ignore-space-change. Now that we attempt the fallback to local bitmap operation only when disconnected, we can safely drop the extra "silent" state request from both invalidate and invalidate-remote. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:24 -06:00
Philipp Reisner	9376d9f8b9	drbd: move invalidating the whole bitmap out of after_state ch() To avoid other state change requests, after passing through sanitize_state(), to be mistaken for an invalidate, move the "set all bits as out-of-sync" into the invalidate path. Make invalidate and invalidate-remote behave consistently wrt. current connection state (need either an established replication link, or really be disconnected). Also mention that in the documentation. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-28 10:10:24 -06:00
Lars Ellenberg	5bbcf5e6ab	drbd: adjust upper limit for activity log extents Now that the on-disk activity-log ring buffer size is adjustable, the maximum active set can become larger, and is now limited by the use of 16bit "labels". This increases the maximum working set from 6433 to 65534 extents, each of which covers an area of 4MiB. Which means that if you use the maximum, you'd have to resync more than 250 GiB after an unclean Primary shutdown. With capable backend storage and replication links, this is entirely feasible. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 22:18:09 -06:00
Lars Ellenberg	113fef9e20	drbd: prepare to queue write requests on a submit worker Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:14:40 -06:00
Lars Ellenberg	c04ccaa669	drbd: read meta data early, base on-disk offsets on super block We used to calculate all on-disk meta data offsets, and then compare the stored offsets, basically treating them as magic numbers. Now with the activity log striping, the activity log size is no longer fixed. We need to first read the super block, then base the activity log and bitmap offsets on the stored offsets/al stripe settings. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Lars Ellenberg	cccac9857d	drbd: mechanically rename la_size to la_size_sect Make it obvious that this value is in units of 512 Byte sectors. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Lars Ellenberg	68e41a43f1	drbd: use the cached meta_dev_idx Now we have the cached meta_dev_idx member, we can get rid of a few rcu_read_lock() sections and rcu_dereference(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Lars Ellenberg	3a4d4eb3cb	drbd: prepare for new striped layout of activity log Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k The intended use case is activity log on RAID 0 or similar. Logically consecutive transactions will advance their on-disk position by al_stripe_size_4k 4kB (transaction sized) blocks. Right now, these are still asserted to be the backward compatible values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB). Also introduce a caching member for meta_dev_idx in the in-core structure: even though it is initially passed in in the rcu-protected disk_conf structure, it cannot change without a detach/attach cycle. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Lars Ellenberg	ae8bf312e9	drbd: cleanup ondisk meta data layout calculations and defines Add a comment about our meta data layout variants, and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT) to make it clear that they are short hand for fixed constants, and not arbitrarily to be redefined as one may see fit. Properly pad struct meta_data_on_disk to 4kB, and initialize to zero not only the first 512 Byte, but all of it in drbd_md_sync(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-03-22 18:13:59 -06:00
Philipp Reisner	ef86b77957	drbd: Fix drbdsetup wait-connect, wait-sync etc... commands This was introduced when moving the code over from the 8.3 codebase with commit `328e0f125b` Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:04:34 +01:00
Lars Ellenberg	691631c065	drbd: respect no-md-barriers setting also when changed online via disk-options We need to propagate the configuration into the flag bits, or it won't be effective. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:00:04 +01:00
Philipp Reisner	986836503e	Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6	2012-11-09 14:20:23 +01:00
Philipp Reisner	328e0f125b	drbd: Broadcast sync progress no more often than once per second Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:43 +01:00
Philipp Reisner	4035e4c2eb	drbd: Fix clearing of MDF_AL_DISABLED Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:42 +01:00
Lars Ellenberg	edc9f5eb7a	drbd: always write bitmap on detach If we detach due to local read-error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or, "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block, the next resync will fetch it from the peer, and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer, there is no "auto-healing" necessary. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:41 +01:00
Philipp Reisner	19fffd7b03	drbd: Call drbd_md_sync() explicitly after a state change on the connection Without this, the meta-data gets updated after 5 seconds by the md_sync_timer. Better to do it immediately after a state change. If the asender detects a network failure, it may take a bit until the worker processes the according after-conn-state-change work item. The worker might be blocked in sending something, i.e. it takes until it gets into its timeout. That is 6 seconds by default which is longer than the 5 seconds of the md_sync_timer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:08 +01:00
Lars Ellenberg	0ee98e2eb0	drbd: temporarily suspend io in drbd_adm_disk_opts drbd_adm_disk_opts() does wait_event(mdev->al_wait, lc_try_lock(mdev->act_log)); drbd_al_shrink(mdev); If the device is very busy, this can take a very long time to succeed. Fix this by temporarily suspending IO, then quickly change the settings, and resume. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:20 +01:00
Philipp Reisner	39a1aa7f49	drbd: Protect accesses to the uuid set with a spinlock There is at least the worker context, the receiver context, the context of receiving netlink packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:08:04 +01:00
Philipp Reisner	fef45d297e	drbd: Write all pages of the bitmap after an online resize We need to write the whole bitmap after we moved the meta data due to an online resize operation. With the support for one peta byte devices bitmap IO was optimized to only write out touched pages. This optimization must be turned off when writing the bitmap after an online resize. This issue was introduced with drbd-8.3.10. The impact of this bug is that after an online resize, the next resync could become larger than expected. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:51 +01:00
Lars Ellenberg	eb12010e9a	drbd: disambiguation, s/ERR_DISCARD/ERR_DISCARD_IMPOSSIBLE/ If for some reason (typically "split-brained" cluster manager) drbd replica data has diverged, we can choose a victim, and reconnect using "--discard-my-data", causing the victim to become sync-target, fetching all changed blocks from the peer. If we are Primary, we are potentially in use, and we refuse to "roll back" changes to the data below the page cache and other users. Rename the error symbol for this to ERR_DISCARD_IMPOSSIBLE. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:50 +01:00
Lars Ellenberg	427c0434fc	drbd: disambiguation, s/DISCARD_CONCURRENT/RESOLVE_CONFLICTS/ We don't discard anything here, really. We resolve conflicting, concurrent writes to overlapping data blocks. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:49 +01:00
Philipp Marek	3174f8c504	drbd: pass some more information to userspace. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:45 +01:00
Lars Ellenberg	58ffa580a7	drbd: introduce stop-sector to online verify We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Had to bump the protocol version differently, we are now 101. Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; } Also, the return value convention for worker callbacks has changed, we returned "true/false" for "keep the connection up" in 8.3, we return 0 for success and <= for failure in 8.4. Affected: receive_state() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:05:32 +01:00
Lars Ellenberg	970fbde1f1	drbd: flush drbd work queue before invalidate/invalidate remote If you do back to back wait-sync/invalidate on a Primary in a tight loop, during application IO load, you could trigger a race: kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending? Fix this by changing the order of the drbd_queue_work() and the wake_up() in dec_ap_pending(), and adding the additional drbd_flush_workqueue() before requesting the full sync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:41 +01:00
Lars Ellenberg	a324896b17	drbd: do not reset rs_pending_cnt too early Fix asserts like block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 ! We reset the resync lru cache and related information (rs_pending_cnt), once we successfully finished a resync or online verify, or if the replication connection is lost. We also need to reset it if a resync or online verify is aborted because a lower level disk failed. In that case the replication link is still established, and we may still have packets queued in the network buffers which want to touch rs_pending_cnt. We do not have any synchronization mechanism to know for sure when all such pending resync related packets have been drained. To avoid this counter to go negative (and violate the ASSERT that it will always be >= 0), just do not reset it when we lose a disk. It is good enough to make sure it is re-initialized before the next resync can start: reset it when we re-attach a disk. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:40 +01:00
Lars Ellenberg	6f3465ed82	drbd: report congestion if we are waiting for some userland callback If the drbd worker thread is synchronously waiting for some userland callback, we don't want some casual pageout to block on us. Have drbd_congested() report congestion in that case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:39 +01:00
Lars Ellenberg	0c84966601	drbd: differentiate between normal and forced detach Aborting local requests (not waiting for completion from the lower level disk) is dangerous: if the master bio has been completed to upper layers, data pages may be re-used for other things already. If local IO is still pending and later completes, this may cause crashes or corrupt unrelated data. Only abort local IO if explicitly requested. Intended use case is a lower level device that turned into a tarpit, not completing io requests, not even doing error completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:39 +01:00
Lars Ellenberg	27012382bc	drbd: take error path in drbd_adm_down if interrupted by signal drbd_adm_down() does adm_detach(), which can fail with various error codes, or be interrupted by a signal. The interrupted by signal case was not properly handled, leading to block drbd0: ASSERT( mdev->state.disk == D_DISKLESS && mdev->state.conn == C_STANDALONE ) in drbd/drbd_worker.c and further to destroying objects while still in use, and resulting crashes. Detect the interruption, and take the error path out. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:37 +01:00
Lars Ellenberg	b6dd1a8976	drbd: remove struct drbd_tl_epoch objects (barrier works) cherry-picked and adapted from drbd 9 devel branch DRBD requests (struct drbd_request) are already on the per resource transfer log list, and carry their epoch number. We do not need to additionally link them on other ring lists in other structs. The drbd sender thread can recognize itself when to send a P_BARRIER, by tracking the currently processed epoch, and how many writes have been processed for that epoch. If the epoch of the request to be processed does not match the currently processed epoch, any writes have been processed in it, a P_BARRIER for this last processed epoch is sent out first. The new epoch then becomes the currently processed epoch. To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK, the sender thread also needs to handle the case when the current epoch was closed already, but no new requests are queued yet, and send out P_BARRIER as soon as possible. This is done by comparing the per resource "current transfer log epoch" (tconn->current_tle_nr) with the per connection "currently processed epoch number" (tconn->send.current_epoch_nr), while waiting for new requests to be processed in wait_for_work(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:35 +01:00
Philipp Reisner	9a51ab1c1b	drbd: New disk option al-updates By disabling al-updates one might increase performance. The price for that is that in case a crashed primary (that had al-updates disabled) is reintegrated, it will receive a full-resync instead of a bitmap based resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:31 +01:00
Andreas Gruenbacher	26ec92871b	drbd: Stop using NLA_PUT*(). These macros no longer exist in kernel version v3.5-rc1. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:30 +01:00
Lars Ellenberg	5016b82a49	drbd: fix race between drbdadm invalidate/verify and finishing resync When a resync or online verify is finished or aborted, drbd does a bulk write-out of changed bitmap pages. If in that very moment a new verify or resync is triggered, this can race: ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending? and similar. This can be observed with e.g. tight invalidate loops in test scripts, and probably has no real-life implication. Still, that race can be solved by first quiescing the device, before starting a new resync or verify. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:27 +01:00
Philipp Reisner	a1096a6e9d	drbd: Delay/reject other state changes while establishing a connection Changes to the role and disk state should be delayed or rejected while we establish a connection. This is necessary, since the peer will base its resync decision on the UUIDs and the state we sent in the drbd_connect() function. The most prominent example for this race is becoming primary after sending state and UUIDs and before the state changes to C_WF_CONNECTION. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:26 +01:00
Philipp Reisner	27eb13e99b	drbd: Fixed processing of disk-barrier, disk-flushes and disk-drain Since drbd_bump_write_ordering() is called in the attaching process while the disk state is D_ATTACHING, it was not considering these three flags during attach. A call to this function was missing from drbd_adm_disk_opts(). Fixed both issues. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:25 +01:00
Philipp Reisner	25b0d6c8c1	drbd: Reinstate disabling AL updates with invalidate-remote Commit d0ef827e (drbd: switch configuration interface from connector to genetlink) introduced a regression by removing the ability to set all bits in the out of sync bitmap and to suspend updates to the activity log of a disconnected device via the invalidate-remote management call. Credits for reporting the issue are going to Arne Redlich. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:15 +01:00
Philipp Reisner	4b0007c0e8	drbd: Move write_ordering from mdev to tconn This is necessary in order to prepare the move of the (receiver side) epoch list from the device (mdev) to the connection (tconn) objects. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:07 +01:00
Philipp Reisner	43de7c852b	drbd: Fixes from the drbd-8.3 branch * drbd-8.3: drbd: O_SYNC gives EIO on ramdisks for some kernels (eg. RHEL6). drbd: send intermediate state change results to the peer Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:06 +01:00
Philipp Reisner	0cfac5dd90	drbd: Fixes from the drbd-8.3 branch * drbd-8.3: drbd: fix spurious meta data IO "error" drbd: Fixed a race condition between detach and start of resync drbd: fix harmless race to not trigger an ASSERT drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero drbd: Fixed current UUID generation (regression introduced recently, after 8.3.11) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:05 +01:00
Philipp Reisner	9bcd252182	drbd: fix "stalled" empty resync With sync-after dependencies, given "lucky" timing of pause/unpause events, and the end of an empty (0 bits set) resync was sometimes not detected on the SyncTarget, leading to a "stalled" SyncSource state. Fixed this by expecting not only "Inconsistent -> UpToDate" but also "Consistent -> UpToDate" transitions for the peer disk state to end a resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:04 +01:00
Lars Ellenberg	25e409321a	drbd: fix connect failure with all default net-options If no net-options are configured (all on their default), no DRBD_NLA_NET_CONF will be passed to the kernel. The kernel must not require its presence, there is no required option in there. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:03 +01:00
Andreas Gruenbacher	a209b4aec3	drbd: Update some outdated comments to match the code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:02 +01:00
Philipp Reisner	c4e7afdc01	drbd: Remove unused code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:58:02 +01:00
Andreas Gruenbacher	7d4c782cbd	drbd: Fix the data-integrity-alg setting The last data-integrity-alg fix made data integrity checking work when the algorithm was changed for an established connection, but the common case of configuring the algorithm before connecting was still broken. Fix that. Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:59 +01:00
Andreas Gruenbacher	f2257a56ee	drbd: Allow to create devices with a minor number > minor_count The minor_count module/kernel parameter serves to scale the size of drbd's internal memory pool, but it is no longer a limit for the number of minors or the minor number. (Minor numbers can be arbitrarily high within the allowed limit of 2^20.) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:54 +01:00
Lars Ellenberg	367d675da8	drbd: report net config even for resources without a single volume Currently it is legal (though unusual) to create and connect a resource, before adding in all necessary volumes. We should include the network configuration details, even if we don't have a single volume (yet). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:53 +01:00
Philipp Reisner	e0e1665381	drbd: Correctly handle resources without volumes Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:52 +01:00
Philipp Reisner	369bea6371	drbd: Fixed removal of volumes/devices from connected resources When removing a volume/device we need to switch the connection status of the peer back into WFReportParams. Before this fix it was left in Connected state. That means that the peer device continued to inform us about state changes, etc... But we deleted that minor -> protocol error. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:51 +01:00
Lars Ellenberg	d5d7ebd422	drbd: on attach, enforce clean meta data Detection of unclean shutdown has moved into user space. The kernel code will, whenever it updates the meta data, mark it as "unclean", and will refuse to attach to such unclean meta data. "drbdadm up" now schedules "drbdmeta apply-al", which will apply the activity log to the bitmap, and/or reinitialize it, if necessary, as well as set a "clean" indicator flag. This moves a bit of code out of kernel space. As a side effect, it also prevents some 8.3 module from accidentally ignoring the 8.4 style activity log, if someone should downgrade, whether on purpose, or accidentally because he changed kernel versions without providing an 8.4 for the new kernel, and the new kernel comes with in-tree 8.3. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:51 +01:00
Philipp Reisner	cdfda633d2	drbd: detach from frozen backing device * drbd-8.3: documentation: Documented detach's --force and disk's --disk-timeout drbd: Implemented the disk-timeout option drbd: Force flag for the detach operation drbd: Allow new IOs while the local disk is in FAILED state drbd: Bitmap IO functions can not return prematurely if the disk breaks drbd: Added a kref to bm_aio_ctx drbd: Hold a reference to ldev while doing meta-data IO drbd: Keep a reference to the bio until the completion handler finished drbd: Implemented wait_until_done_or_disk_failure() drbd: Replaced md_io_mutex by an atomic: md_io_in_use drbd: moved md_io into mdev drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk drbd: Keep a reference to barrier acked requests Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:50 +01:00
Philipp Reisner	9510b2411d	drbd: Fixed state transitions in case reading meta data fails Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:50 +01:00
Philipp Reisner	2ffca4f3ee	drbd: Improve compatibility with drbd's older than 8.3.7 Regression introduced with 8.3.11 commit: drbd: Take a more conservative approach when deciding max_bio_size Never ever tell an older drbd, that we support more than 32KiB in a single data request (packet). Never believe an older drbd, that it supports more than 32KiB in a single data request (packet) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:49 +01:00
Andreas Gruenbacher	d0fa7fd680	drbd: Remove dead code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:47 +01:00
Andreas Gruenbacher	afbbfa88bc	drbd: Allow to pass resource options to the new-resource command This is equivalent to how the attach and connect commands work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:46 +01:00
Andreas Gruenbacher	089c075d88	drbd: Convert the generic netlink interface to accept connection endpoints Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:46 +01:00
Andreas Gruenbacher	44e52cfaa2	drbd: Rename DRBD_ADM_NEED_{CONN -> RESOURCE} Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:46 +01:00
Andreas Gruenbacher	01b39b50d3	drbd: Split off netlink mandatory attribute handling into separate file Duplicate this file in the kernel module and in user space; both sides need it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:45 +01:00
Andreas Gruenbacher	7c3063cc6f	drbd: Also need to check for DRBD_GENLA_F_MANDATORY flags before nla_find_nested() This is done by introducing drbd_nla_find_nested() which handles the flag before calling nla_find_nested(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:45 +01:00
Andreas Gruenbacher	789c1b626c	drbd: Use the terminology suggested by the command names in the source code and messages Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:44 +01:00
Lars Ellenberg	67b58bf723	drbd: spelling fix: too small It is not "to small", but "too small". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:44 +01:00
Andreas Gruenbacher	c75b9b10e7	drbd: Don't use empty nested netlink attributes Before mainline commit `ea5693cc` (v2.6.29-rc1), empty nested netlink attributes were not allowed. Fix that by leaving out nested attributes if they are empty and by allowing the top-level attributes to be missing. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:57:40 +01:00
Andreas Gruenbacher	1e2a2551ee	drbd: drbd_adm_prepare(): Pass through error codes Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:56 +01:00
Philipp Reisner	d659f2aaea	drbd: Send PROTOCOL_UPDATE packets when appropriate Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:54 +01:00
Philipp Reisner	dcb20d1a8e	drbd: Refuse to change network options online when... * the peer does not speak protocol_version 100 and the user wants to change one of: - wire_protocol - two_primaries - integrity_alg * the user wants to remove the allow_two_primaries flag when there are two primaries Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:52 +01:00
Andreas Gruenbacher	95f8efd08b	drbd: Fix the upper limit of resync-after The 32-bit resync_after netlink field takes a device minor number as parameter, which is no longer limited to 255. We cannot statically verify which device numbers are valid, so set the upper limit to the highest possible signed 32-bit integer. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:51 +01:00
Andreas Gruenbacher	6139f60dc1	drbd: Rename the want_lose field/flag to discard_my_data This is what it is called in config files and on the command line as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:49 +01:00
Andreas Gruenbacher	6f9b5f84f5	drbd: Make broadcast events return NO_ERROR Instead of returning a ret_code outside of the range of enum drbd_ret_code, use NO_ERROR to indicate success. This way, ret_code has the same meaning in all packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:48 +01:00
Philipp Reisner	c141ebda03	drbd: Removing drbd_cfg_rwsem * Updates to all configuration items are done under genl_lock(), including removal of mdevs or tconns. * All non-sleeping read sides are protected by RCU * All sleeping read sides keep reference counts to keep the objects alive Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:48 +01:00
Philipp Reisner	ec0bddbc55	drbd: Use RCU for the drbd_tconns list Preparing removal of drbd_cfg_rwsem Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:47 +01:00
Philipp Reisner	81fa2e675c	drbd: Refcounting for mdev objects Preparing removal of drbd_cfg_rwsem Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:47 +01:00
Andreas Gruenbacher	e544046ab8	drbd: Turn no-md-flushes into md-flushes={yes|no} Change the --no-md-flushes drbdsetup command line option as well as the no_md_flush netlink packet. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:46 +01:00
Philipp Reisner	813472ced7	drbd: RCU for rs_plan_s This removes the issue with using peer_seq_lock out of different contexts. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:44 +01:00
Philipp Reisner	d589a21e5d	drbd: Enforce limits of disk_conf members; centralized these checks Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:44 +01:00
Philipp Reisner	9958c857c7	drbd: Made the fifo object a self contained object (preparing for RCU) * Moved rs_planed into it, named total * When having a pointer to the object the values can be embedded into the fifo object. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:43 +01:00
Philipp Reisner	daeda1cca9	drbd: RCU for disk_conf Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:43 +01:00
Lars Ellenberg	563e4cf25e	drbd: Introduce __s32_field in the genetlink macro magic ...and drop explicit typecasts (int)meta_dev_idx < 0. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:43 +01:00
Philipp Reisner	dc97b70801	drbd: Split drbd_alter_sa() into drbd_sync_after_valid() and drbd_sync_after_changed() Preparing RCU for disk_conf Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:42 +01:00
Philipp Reisner	ef5e44a672	drbd: drbd_new_dev_size() gets the user-requested disk_size as argument Preparing RCU for disk_conf Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:41 +01:00
Philipp Reisner	a0095508ca	drbd: Renamed the net_conf_update mutex to conf_update Preparing to use the same mutex for disk_conf updates Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:41 +01:00
Philipp Reisner	934e6138b5	drbd: Removed dead code Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:40 +01:00
Andreas Gruenbacher	b966b5dd8e	drbd: Generate the drbd_set_*_defaults() functions from drbd_genl.h Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:55:38 +01:00
Lars Ellenberg	009ba89db5	drbd: fix schedule in atomic An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occurring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider renaming it D_DETACHING. An alternative approach would be to create yet another work item to be scheduled on the worker, do the destructor work from there, and get the timing right. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:53:01 +01:00
Lars Ellenberg	992d6e91d3	drbd: fix thread stop deadlock There are races where the receiver may be exiting, but still need the worker to process some stuff. Do not wait for the receiver to die from an exiting worker. The receiver must already be dead in case the worker decides to exit. If the receiver was still alive, it may still want to queue work, and do drbd_flush_workqueue() from its disconnect cleanup code, which would no longer be processed by an exiting worker. This also would deadlock, if the worker was to synchronously wait for the receiver to die. Do not implicitly stop the worker. The worker will only be stopped from configuration context, from conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(), after making sure the receiver is already stopped. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:53:00 +01:00
Lars Ellenberg	f3dfa40a67	drbd: fix race when forcefully disconnecting If a forced disconnect hits a restarting receiver right after it passed its final "if (C_DISCONNECTING)" test in drbdd_init(), but before it was actually restarted by drbd_thread_setup, we could be left with a connection stuck in C_DISCONNECTING, never reaching C_STANDALONE, which would be necessary to take it down or reconfigure it. Move the last cleanup into w_after_conn_state_ch(), and do an additional state change request in conn_try_disconnect(), just in case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:53:00 +01:00
Andreas Gruenbacher	88104ca458	drbd: Allow to change data-integrity-alg on the fly The main purpose of this is to allow to turn data integrity checking on and off on demand without causing interruptions. Implemented by allocating tconn->peer_integrity_tfm only when receiving a P_PROTOCOL message. All accesses to tconn->peer_integrity_tfm happen in worker context, and no further synchronization is necessary. On the sender side, tconn->integrity_tfm is modified under tconn->data.mutex, and a P_PROTOCOL message is sent whenever it changes. All accesses to tconn->integrity_tfm already happen under this mutex. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:59 +01:00
Andreas Gruenbacher	4b6ad6d457	drbd: Remove obsolete drbd_crypto_is_hash() We allocate hash transformations with crypto_alloc_hash() which will only return hash algorithms. It is not necessary to reconfirm that we actually got a hash algorithm. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:58 +01:00
Andreas Gruenbacher	5b614abe30	drbd: Rename integrity_r_tfm -> peer_integrity_tfm Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:58 +01:00
Andreas Gruenbacher	8d412fc6d5	drbd: Rename integrity_w_tfm -> integrity_tfm Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:58 +01:00
Lars Ellenberg	b57a1e27ee	drbd: rename variable sc to res_opts sc was short for syncer conf, which does not exist anymore anyways. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:56 +01:00
Lars Ellenberg	5ecc72c3b9	drbd: rename variable ndc to new_disk_conf Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:52:54 +01:00
Lars Ellenberg	5979e36155	drbd: on reconfiguration requests, mind the SET_DEFAULTS flag The DRBD_GENL_F_SET_DEFAULTS flag was ignored for drbd_adm_disk_opts() and drbd_adm_net_opts(). Factor out drbd_set_*_defaults() helper functions, and call them appropriately. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:50:38 +01:00
Philipp Reisner	0fd0ea064c	drbd: Consider all crypto options in connect and in net-options So far this was simply not considered after the options have been re-arranged. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:08 +01:00
Lars Ellenberg	d9cc6e2318	drbd: fix various disconnecting races If an admin requests disconnect at a time when the state handling already disconnects/reconnects, there have been some races. Make sure to always really stop the network threads before returning success for disconnect. Do not pretend a successful forced disconnect, if the state handling returned an error. Return success from drbd_adm_down() only after all threads are finished. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:08 +01:00
Lars Ellenberg	5ee743e92d	drbd: remove useless kobject_uevent from drbd_adm_connect Calling kobject_uevent, which may sleep, from within rcu_read_lock() protected regions is not possible. This particular kobject_uevent is also wrong. It was supposed to trigger a udev run, just in case something relevant to udev symlink magic has changed, when adjusting runtime re-configurable settings while we still had the "syncer conf". It was improperly placed in connect when we dropped the "syncer conf". The right thing to do is probably to call "udevadm trigger" directly in those cases where drbdadm thinks there was a need to trigger extra udev runs. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:07 +01:00
Philipp Reisner	a18e9d1eb0	drbd: Removed the OBJECT_DYING and the CONFIG_PENDING bits superseded by refcounting Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:07 +01:00
Philipp Reisner	0ace9dfabe	drbd: Take a reference on tconn when finding a tconn by name Rule #3 of kref.txt Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:06 +01:00
Philipp Reisner	9dc9fbb357	drbd: Basic refcounting for drbd_tconn References held by: * Each (running) drbd thread has a reference on tconn * Each mdev has a reference on tconn * Being in the all_tconn list counts for one reference * Each after_conn_state_chg_work has a reference to tconn Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:06 +01:00
Lars Ellenberg	71932efc1c	drbd: allow status dump request all volumes of a specific resource We had drbd_adm_get_status (one single volume), and drbd_adm_get_status_all (dump of all volumes of all resources). This enhances the latter to be able to dump all volumes of just one specific resource. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:04 +01:00
Philipp Reisner	91fd4dad64	drbd: Proper locking for updates to net_conf under RCU Removing the get_net_conf()/put_net_conf() functions Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:49:03 +01:00
Philipp Reisner	44ed167da7	drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf Removing the get_net_conf()/put_net_conf() calls Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:48:59 +01:00
Philipp Reisner	b032b6fa35	drbd: Allow online change of replication protocol only with agreed_pv >= 100 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:18 +01:00
Philipp Reisner	cd64397c0b	drbd: Check consistency of net options when they get changed online Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:18 +01:00
Philipp Reisner	d3fcb4908d	drbd: protect all idr accesses that might sleep with drbd_cfg_rwsem With this commit the locking for all accesses to IDRs is complete: * Non sleeping read accesses are protected by RCU * sleeping read accesses are protected by a read lock on drbd_cfg_rwsem * accesses that add anything are protected by a write lock * accesses that remove an object are protected by a write lock and a call to synchronize_rcu() after it is removed from the IDR and before the object is actually free()ed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:17 +01:00
Philipp Reisner	ef35626284	drbd: Converted drbd_cfg_mutex into drbd_cfg_rwsem Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:17 +01:00
Philipp Reisner	695d08fa94	drbd: rcu_read_[un]lock() for all idr accesses that do not sleep Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:16 +01:00
Philipp Reisner	ff370e5a9e	drbd: drbd_delete_device() takes a struct drbd_conf * now Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:15 +01:00
Andreas Gruenbacher	0c8e36d9b8	drbd: Introduce protocol version 100 headers The 8 byte header finally becomes too small. With the protocol 100 header we have 16 bit for the volume number, proper 32 bit for the data length, and 32 bit for further extensions in the future. Previous versions of drbd are using version 80 headers for all packets short enough for protocol 80. They support both header versions in worker context, but only version 80 headers in asynchronous context. For backwards compatibility, continue to use version 80 headers for short packets before protocol version 100. From protocol version 100 on, use the same header version for all packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:10 +01:00
Andreas Gruenbacher	da39fec492	drbd: Remove now-unused int_dig_out buffer Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:09 +01:00
Philipp Reisner	19f83c7661	drbd: Implemented conn_lowest_conn() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:05 +01:00
Philipp Reisner	da9fbc276e	drbd: Introduced a new type union drbd_dev_state Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:04 +01:00
Philipp Reisner	2aebfabb17	drbd: Renamed id_susp(union drbd_state s) to drbd_suspended(struct drbd_conf *) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:03 +01:00
Philipp Reisner	78bae59b1b	drbd: Introduced drbd_read_state() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:45:03 +01:00
Philipp Reisner	cb703454a2	drbd: Converted drbd_try_outdate_peer() from mdev to tconn Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:44:53 +01:00
Andreas Gruenbacher	22ab6a30b8	drbd: drbd_bm_read() never returns a positive value through drbd_bitmap_io() Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:44:47 +01:00
Philipp Reisner	e90285e0ba	drbd: Fixed conn_lowest_minor It actually returned the lowest volume number. While doing that renamed a few wrongly named variables. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:44:28 +01:00
Lars Ellenberg	f399002e68	drbd: distribute former syncer_conf settings to disk, connection, and resource level This commit breaks the API again. Move per-volume former syncer options into disk_conf. Move per-connection former syncer options into net_conf. Renamed the remaining sync_conf to res_opts. Syncer settings have been changeable at runtime, so we need to prepare for these settings to be runtime-changeable in their new home as well. Introduce new configuration operations, and share the netlink attribute between "attach" (create new disk) and "disk-opts" (change options). Same for "connect" and "net-opts". Some fields cannot be changed at runtime, however. Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in the generated validation and assignment functions. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-08 16:44:20 +01:00
Philipp Reisner	6b75dced00	drbd: conn_khelper() for user mode callbacks for connections Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:32 +01:00
Lars Ellenberg	40cbf085f5	drbd: fix conn_reconfig_start without conn_reconfig_done in drbd_adm_attach If drbd_adm_attach failed early, it left the CONFIG_PENDING bit on, blocking any further conn_reconfig_start on that connection. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:31 +01:00
Lars Ellenberg	85f75dd763	drbd: introduce in-kernel "down" command This greatly simplifies deconfiguration of whole resources. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:23 +01:00
Lars Ellenberg	527f4b24e5	drbd: bail out if a config request is over-determined, and not matching We have resources resp. connections, volumes, and minor numbers. A config request may specify all three of them. If it turns out that the minor belongs to a different connection, or a different volume number in the same connection, that configuration request is invalid. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:21 +01:00
Lars Ellenberg	38f19616d2	drbd: new-connection and new-minor succeed, if the object already exists Follow O_CREAT semantics when creating connection or minor device/volume objects. If we need O_CREAT|O_EXCL semantics some time down the road, we can add NLM_F_EXCL to the netlink message flags. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:21 +01:00
Lars Ellenberg	cffec5b2fe	drbd: Allow a Diskless Secondary volume to be removed Even if the connection is still established. We should be able to reduce a volume from a replication group, without taking the whole group offline. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:20 +01:00
Lars Ellenberg	543cc10b4c	drbd: drbd_adm_get_status needs to show some more detail We want to see existing connection objects, even if they do not currently have volumes attached. Change the .dumpit variant of drbd_adm_get_status to iterate not over minor devices, but over connections + volumes. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:19 +01:00
Lars Ellenberg	8432b31457	drbd: allow holes in minor and volume id allocation s/idr_get_new/idr_get_new_above/ Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:17 +01:00
Lars Ellenberg	3b98c0c209	drbd: switch configuration interface from connector to genetlink Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-04 00:16:17 +01:00
Lars Ellenberg	a2a3c74f24	drbd: always write bitmap on detach If we detach due to local read-error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or, "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block, the next resync will fetch it from the peer, and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer, there is no "auto-healing" necessary. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-10-30 08:39:18 +01:00
Lars Ellenberg	06f10adbdb	drbd: prepare for more than 32 bit flags - struct drbd_conf { ... unsigned long flags; ... } + struct drbd_conf { ... unsigned long drbd_flags[N]; ... } And introduce wrapper functions for test/set/clear bit operations on this member. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-10-30 08:39:18 +01:00
Lars Ellenberg	02b91b5526	drbd: introduce stop-sector to online verify We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-10-30 08:39:01 +01:00
Philipp Reisner	9f2247bb9b	drbd: Protect accesses to the uuid set with a spinlock There are at least the worker context, the receiver context, the context of receiving netlink packets, and processes reading a sysfs attribute that access the uuids. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-10-30 08:39:01 +01:00
Philipp Reisner	d1aa4d04da	drbd: Write all pages of the bitmap after an online resize We need to write the whole bitmap after we moved the meta data due to an online resize operation. With the support for one peta byte devices bitmap IO was optimized to only write out touched pages. This optimization must be turned off when writing the bitmap after an online resize. This issue was introduced with drbd-8.3.10. The impact of this bug is that after an online resize, the next resync could become larger than expected. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-08-16 17:17:35 +02:00
Lars Ellenberg	db141b2f42	drbd: fix max_bio_size to be unsigned We capped our max_bio_size and max_hw_sectors, respectively, with min_t(int, lower level limit, our limit); unfortunately, some drivers, e.g. the kvm virtio block driver, initialize their limits to "-1U", and that is of course a smaller "int" value than our limit. Impact: we started to request 16 MB resync requests, which led to a protocol error and a reconnect loop. Fix all relevant constants and parameters to be unsigned int. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 15:14:00 +02:00
Lars Ellenberg	7ee1fb93f3	drbd: flush drbd work queue before invalidate/invalidate remote If you do back to back wait-sync/invalidate on a Primary in a tight loop, during application IO load, you could trigger a race: kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending? Fix this by changing the order of the drbd_queue_work() and the wake_up() in dec_ap_pending(), and adding the additional drbd_flush_workqueue() before requesting the full sync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:15:58 +02:00
Lars Ellenberg	0029d62434	drbd: do not reset rs_pending_cnt too early Fix asserts like block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 ! We reset the resync lru cache and related information (rs_pending_cnt), once we successfully finished a resync or online verify, or if the replication connection is lost. We also need to reset it if a resync or online verify is aborted because a lower level disk failed. In that case the replication link is still established, and we may still have packets queued in the network buffers which want to touch rs_pending_cnt. We do not have any synchronization mechanism to know for sure when all such pending resync related packets have been drained. To keep this counter from going negative (and violating the ASSERT that it will always be >= 0), just do not reset it when we lose a disk. It is good enough to make sure it is re-initialized before the next resync can start: reset it when we re-attach a disk. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:09:53 +02:00
Lars Ellenberg	c2ba686f35	drbd: report congestion if we are waiting for some userland callback If the drbd worker thread is synchronously waiting for some userland callback, we don't want some casual pageout to block on us. Have drbd_congested() report congestion in that case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:07:18 +02:00

373 Commits (c6ae4c04a861dac4d174fd3e90128d5232c8661b)