1
0
Fork 0
Commit Graph

199 Commits (4246a0b63bd8f56a1469b12eafeb875b1041a451)

Author SHA1 Message Date
Steven Whitehouse 48c2b61361 GFS2: Add commit= mount option
It has always been possible to adjust the gfs2 log commit
interval, but only from the sysfs interface. This adds a
mount option, commit=<nn>, which will be familar to ext3
users.

The sysfs interface continues to be available as well, although
this might be removed in the future.

Also this patch cleans up some duplicated structures in the GFS2
sysfs code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-13 14:49:48 +01:00
Al Viro e24977d45f Reduce path_lookup() abuses
... use kern_path() where possible

[folded a fix from rdd]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09 10:49:42 -04:00
Nikanth Karthikesan b1fffc9ca6 gfs2: Remove code handling bio_alloc failure with __GFP_WAIT
Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_NOFS implies __GFP_WAIT.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15 12:10:13 +02:00
Steven Whitehouse 6bac243f07 GFS2: Clean up of glops.c
This cleans up a number of bits of code mostly based in glops.c.
A couple of simple functions have been merged into the callers
to make it more obvious what is going on, the mysterious raising
of i_writecount around the truncate_inode_pages() call has been
removed. The meta_go_* operations have been renamed rgrp_go_*
since that is the only lock type that they are used with.

The unused argument of gfs2_read_sb has been removed. Also
a bug has been fixed where a check for the rindex inode was
in the wrong callback. More comments are added, and the
debugging code is improved too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:27 +00:00
Steven Whitehouse 02e3cc70ec GFS2: Expose UUID via sysfs/uevent
Since we have a UUID, we ought to expose it to the user via sysfs
and uevents. We already have the fs name in both of these places
(a combination of the lock proto and lock table name) so if we add
the UUID as well, we have a full set.

For older filesystems (i.e. those created before mkfs.gfs2 was writing
UUIDs by default) the sysfs file will appear zero length, and no UUID
env var will be added to the uevents.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:21 +00:00
Steven Whitehouse e7c8707ea2 GFS2: Fix error path ref counting for root inode
We were keeping hold of an extra ref to the root inode in one
of the error paths, that resulted in a hang.

Reported-by: Nate Straz <nstraz@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Robert Peterson <rpeterso@redhat.com>
2009-03-24 11:21:17 +00:00
Steven Whitehouse f057f6cdf6 GFS2: Merge lock_dlm module into GFS2
This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplifcation of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.

This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:14 +00:00
Steven Whitehouse 22077f57de GFS2: Remove "double" locking in quota
We only really need a single spin lock for the quota data, so
lets just use the lru lock for now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>
2009-03-24 11:21:13 +00:00
Abhijith Das 0a7ab79c5b GFS2: change gfs2_quota_scan into a shrinker
Deallocation of gfs2_quota_data objects now happens on-demand through a
shrinker instead of routinely deallocating through the quotad daemon.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:12 +00:00
Steven Whitehouse 6f04c1c7fe GFS2: Fix remount argument parsing
The following patch fixes an issue relating to remount and argument
parsing. After this fix is applied, remount becomes atomic in that
it either succeeds changing the mount to the new state, or it fails
and leaves it in the old state. Previously it was possible for the
parsing of options to fail part way though and for the fs to be left
in a state where some of the new arguments had been applied, but some
had not.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:10 +00:00
Steven Whitehouse 88a19ad066 GFS2: Fix use-after-free bug on umount (try #2)
This should solve the issue with the previous attempt at fixing this.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:19 +00:00
Steven Whitehouse fefc03bfed Revert "GFS2: Fix use-after-free bug on umount"
This reverts commit 78802499912f1ba31ce83a94c55b5a980f250a43.

The original patch is causing problems in relation to order of
operations at umount in relation to jdata files. I need to fix
this a different way.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:18 +00:00
Steven Whitehouse 3af165ac4d GFS2: Fix use-after-free bug on umount
There was a use-after-free with the GFS2 super block during
umount. This patch moves almost all of the umount code from
->put_super into ->kill_sb, the only bit that cannot be moved
being the glock hash clearing which has to remain as ->put_super
due to umount ordering requirements. As a result its now obvious
that the kfree is the final operation, whereas before it was
hidden in ->put_super.

Also gfs2_jindex_free is then only referenced from a single file
so thats moved and marked static too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:14 +00:00
Steven Whitehouse b52896813c GFS2: Fix bug in gfs2_lock_fs_check_clean()
gfs2_lock_fs_check_clean() should not be calling gfs2_jindex_hold()
since it doesn't work like rindex hold, despite the comment. That
allows gfs2_jindex_hold() to be moved into ops_fstype.c where it
can be made static.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:11 +00:00
Steven Whitehouse 97cc1025b1 GFS2: Kill two daemons with one patch
This patch removes the two daemons, gfs2_scand and gfs2_glockd
and replaces them with a shrinker which is called from the VM.

The net result is that GFS2 responds better when there is memory
pressure, since it shrinks the glock cache at the same rate
as the VFS shrinks the dcache and icache. There are no longer
any time based criteria for shrinking glocks, they are kept
until such time as the VM asks for more memory and then we
demote just as many glocks as required.

There are potential future changes to this code, including the
possibility of sorting the glocks which are to be written back
into inode number order, to get a better I/O ordering. It would
be very useful to have an elevator based workqueue implementation
for this, as that would automatically deal with the read I/O cases
at the same time.

This patch is my answer to Andrew Morton's remark, made during
the initial review of GFS2, asking why GFS2 needs so many kernel
threads, the answer being that it doesn't :-) This patch is a
net loss of about 200 lines of code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:09 +00:00
Steven Whitehouse 9ac1b4d9b6 GFS2: Move gfs2_recoverd into recovery.c
By moving gfs2_recoverd, we can make an additional function static
and it also leaves only (the already scheduled for removal) gfs2_glockd
in daemon.c.

At the same time the declaration of gfs2_quotad is moved to quota.h
to reflect the new location of gfs2_quotad in a previous patch. Also
the recovery.h and quota.h headers are cleaned up.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:07 +00:00
Steven Whitehouse 813e0c46c9 GFS2: Fix "truncate in progress" hang
Following on from the recent clean up of gfs2_quotad, this patch moves
the processing of "truncate in progress" inodes from the glock workqueue
into gfs2_quotad. This fixes a hang due to the "truncate in progress"
processing requiring glocks in order to complete.

It might seem odd to use gfs2_quotad for this particular item, but
we have to use a pre-existing thread since creating a thread implies
a GFP_KERNEL memory allocation which is not allowed from the glock
workqueue context. Of the existing threads, gfs2_logd and gfs2_recoverd
may deadlock if used for this operation. gfs2_scand and gfs2_glockd are
both scheduled for removal at some (hopefully not too distant) future
point. That leaves only gfs2_quotad whose workload is generally fairly
light and is easily adapted for this extra task.

Also, as a result of this change, it opens the way for a future patch to
make the reading of the inode's information asynchronous with respect to
the glock workqueue, which is another improvement that has been on the list
for some time now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:06 +00:00
Steven Whitehouse 37b2c8377c GFS2: Clean up & move gfs2_quotad
This patch is a clean up of gfs2_quotad prior to giving it an
extra job to do in addition to the current portfolio of updating
the quota and statfs information from time to time.

As a result it has been moved into quota.c allowing one of the
functions it calls to be made static. Also the clean up allows
the two existing functions to have separate timeouts and also
to coexist with its future role of dealing with the "truncate in
progress" inode flag.

The (pointless) setting of gfs2_quotad_secs is removed since we
arrange to only wake up quotad when one of the two timers expires.

In addition the struct gfs2_quota_data is moved into a slab cache,
mainly for easier debugging. It should also be possible to use
a shrinker in the future, rather than the current scheme of scanning
the quota data entries from time to time.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:05 +00:00
Steven Whitehouse c9e9888677 GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize
This patch moved the i_size field from the gfs2_dinode_host and
following the ext3 convention renames it i_disksize.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:38:58 +00:00
Steven Whitehouse b276058371 GFS2: Rationalise header files
Move the contents of some headers which contained very
little into more sensible places, and remove the original
header files. This should make it easier to find things.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:38:48 +00:00
Steven Whitehouse 719ee34467 GFS2: high time to take some time over atime
Until now, we've used the same scheme as GFS1 for atime. This has failed
since atime is a per vfsmnt flag, not a per fs flag and as such the
"noatime" flag was not getting passed down to the filesystems. This
patch removes all the "special casing" around atime updates and we
simply use the VFS's atime code.

The net result is that GFS2 will now support all the same atime related
mount options of any other filesystem on a per-vfsmnt basis. We do lose
the "lazy atime" updates, but we gain "relatime". We could add lazy
atime to the VFS at a later date, if there is a requirement for that
variant still - I suspect relatime will be enough.

Also we lose about 100 lines of code after this patch has been applied,
and I have a suspicion that it will speed things up a bit, even when
atime is "on". So it seems like a nice clean up as well.

From a user perspective, everything stays the same except the loss of
the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very
least, and to be honest I don't think anybody ever used it) and that a
number of options which were ignored before now work correctly.

Please let me know if you've got any comments. I'm pushing this out
early so that you can all see what my plans are.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-18 13:53:59 +01:00
Abhijith Das acd2c8aa02 GFS2: GFS2 will panic if you misspell any mount options
The gfs2 superblock pointer is NULL after a failed mount. When control
eventually goes to gfs2_kill_sb, we dereference this NULL pointer. This
patch ensures that the gfs2 superblock pointer is not NULL before being
dereferenced in gfs2_kill_sb.

Signed-off-by:   Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-15 16:08:32 +01:00
Steven Whitehouse 9b8df98fc8 GFS2: Fix metafs mounts
This patch is intended to fix the issues reported in bz #457798. Instead
of having the metafs as a separate filesystem, it becomes a second root
of gfs2. As a result it will appear as type gfs2 in /proc/mounts, but it
is still possible (for backwards compatibility purposes) to mount it as
type gfs2meta. A new mount flag "meta" is introduced so that its possible
to tell the two cases apart in /proc/mounts.

As a result it becomes possible to mount type gfs2 with -o meta and
get the same result as mounting type gfs2meta. So it is possible to
mount just the metafs on its own. Currently if you do this, its then
impossible to mount the "normal" root of the gfs2 filesystem without
first unmounting the metafs root. I'm not sure if thats a feature or
a bug :-)

Either way, this is a great improvement on the previous scheme and I've
verified that it works ok with bind mounts on both the "normal" root
and the metafs root in various combinations.

There were also a bunch of functions in super.c which didn't belong there,
so this moves them into ops_fstype.c where they can be static. Hopefully
the mount/umount sequence is now more obvious as a result.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
2008-08-13 09:59:40 +01:00
Steven Whitehouse 9cabcdbd46 [GFS2] Replace rgrp "recent list" with mru list
This patch removes the "recent list" which is used during allocation
and replaces it with the (already existing) mru list used during
deletion. The "recent list" was not a true mru list leading to a number
of inefficiencies including a "next" function which made scanning the
list an order N^2 operation wrt to the number of list elements.

This should increase allocation performance with large numbers of rgrps.
Its also a useful preparation and cleanup before some further changes
which are planned in this area.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-07-10 15:54:12 +01:00
Steven Whitehouse 1bdad60633 [GFS2] Remove remote lock dropping code
There are several reasons why this is undesirable:

 1. It never happens during normal operation anyway
 2. If it does happen it causes performance to be very, very poor
 3. It isn't likely to solve the original problem (memory shortage
    on remote DLM node) it was supposed to solve
 4. It uses a bunch of arbitrary constants which are unlikely to be
    correct for any particular situation and for which the tuning seems
    to be a black art.
 5. In an N node cluster, only 1/N of the dropped locked will actually
    contribute to solving the problem on average.

So all in all we are better off without it. This also makes merging
the lock_dlm module into GFS2 a bit easier.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-06-27 09:39:44 +01:00
Steven Whitehouse 048bca2237 [GFS2] No lock_nolock
This patch merges the lock_nolock module into GFS2 itself. As well as removing
some of the overhead of the module, it also means that its now impossible to
build GFS2 without a lock module (which would be a pointless thing to do
anyway).

We also plan to merge lock_dlm into GFS2 in the future, but that is a more
tricky task, and will therefore be a separate patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2008-06-27 09:39:28 +01:00
Jean Delvare 00377d8e38 [GFS2] Prefer strlcpy() over snprintf()
strlcpy is faster than snprintf when you don't use the returned value.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-05-12 08:57:11 +01:00
Benjamin Marzinski 58e9fee13e [GFS2] Invalidate cache at correct point
GFS2 wasn't invalidating its cache before it called into the lock manager
with a request that could potentially drop a lock.  This was leaving a
window where the lock could be actually be held by another node, but the
file's page cache would still appear valid, causing coherency problems.
This patch moves the cache invalidation to before the lock manager call
when dropping a lock. It also adds the option to the lock_dlm lock
manager to not use conversion mode deadlock avoidance, which, on a
conversion from shared to exclusive, could internally drop the lock, and
then reacquire in. GFS2 now asks lock_dlm to not do this.  Instead, GFS2
manually drops the lock and reacquires it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:44 +01:00
Steven Whitehouse 860b25d4a9 [GFS2] Remove drop of module ref where not needed
In an earlier patch "[GFS2] fix file_system_type leak on gfs2meta mount"
we removed the code to grab a ref to the module which was not needed
(since we know that the module cannot be unloaded at that time) so
this patch removes the code to drop that reference.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:33 +01:00
Christoph Hellwig 7dc2cf1c8f [GFS2] fix file_system_type leak on gfs2meta mount
get_gfs2_sb does a get_fs_type without doing a put_filesystem and
thus leaking a file_system_type reference everytime it's called.

Just use gfs2_fs_type directly instead of doing the lookup and thus
fix the problem.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:17 +01:00
Bob Peterson cf45b752c9 [GFS2] Remove rgrp and glock version numbers
This patch further reduces GFS2's memory requirements by
eliminating the 64-bit version number fields in lieu of
a couple bits.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:29 +01:00
Steven Whitehouse da755fdb41 [GFS2] Remove lm.[ch] and distribute content
The functions in lm.c were just wrappers which were mostly
only used in one other file. By moving the functions to
the files where they are being used, they can be marked
static and also this will usually result in them being inlined
since they are often only used from one point in the code.

A couple of really trivial functions have been inlined by hand
into the function which called them as it makes the code clearer
to do that.

We also gain from one fewer function call in the glock lock and
unlock paths.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:26 +01:00
Jan Blunck 1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Bob Peterson 0811a127cb [GFS2] Initialize extent_list earlier
Here is a patch for the latest upstream GFS2 code:
The journal extent map needs to be initialized sooner than it
currently is.  Otherwise failed mount attempts (e.g. not enough
journals, etc.) may panic trying to access the uninitialized list.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:17:04 +00:00
Bob Peterson fa3742fa85 [GFS2] Minor correction
This is a small correction to my previously posted patch1.
It just changes a divide to a shift.  It's faster and doesn't
introduce odd dependencies on 32-bit compiles.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:15:37 +00:00
Bob Peterson c3f60b6e3a [GFS2] Eliminate the no longer needed sd_statfs_mutex
This patch eliminates the unneeded sd_statfs_mutex mutex but preserves
the ordering as discussed.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:15:16 +00:00
Bob Peterson da6dd40d59 [GFS2] Journal extent mapping
This patch saves a little time when gfs2 writes to the journals by
keeping a mapping between logical and physical blocks on disk.
That's better than constantly looking up indirect pointers in
buffers, when the journals are several levels of indirection
(which they typically are).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:11:46 +00:00
Steven Whitehouse fd041f0b40 [GFS2] Use atomic_t for journal free blocks counter
This patch changes the counter which keeps track of the free
blocks in the journal to an atomic_t in preparation for the
following patch which will update the log reservation code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:07:54 +00:00
Steven Whitehouse 2bcd610d2f [GFS2] Don't add glocks to the journal
The only reason for adding glocks to the journal was to keep track
of which locks required a log flush prior to release. We add a
flag to the glock to allow this check to be made in a simpler way.

This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
and means that we can avoid extra work during the journal flush.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:07:52 +00:00
Benjamin Marzinski 7a9f53b3c1 [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed
There is a possible deadlock between two processes on the same node, where one
process is deleting an inode, and another process is looking for allocated but
unused inodes to delete in order to create more space.

process A does an iput() on inode X, and it's i_count drops to 0. This causes
iput_final() to be called, which puts an inode into state I_FREEING at
generic_delete_inode(). There no point between when iput_final() is called, and
when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
set, no other process on that node can successfully look up that inode until
the delete finishes.

process B locks the the resource group for the same inode in get_local_rgrp(),
which is called by gfs2_inplace_reserve_i()

process A tries to lock the resource group for the inode in
gfs2_dinode_dealloc(), but it's already locked by process B

process B waits in find_inode for the inode to have the I_FREEING state cleared.

Deadlock.

This patch solves the problem by adding an alternative to gfs2_iget(),
gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
state.o The alternate test function is just like the original one, except that
it fails if the inode is being freed, and sets a skipped flag. The alternate
set function is just like the original, except that it fails if the skipped
flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
gfs2_iget().

Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:56:29 +01:00
Steven Whitehouse 16615be18c [GFS2] Clean up journaled data writing
This patch cleans up the code for writing journaled data into the log.
It also removes the need to allocate a small "tag" structure for each
block written into the log. Instead we just keep count of the outstanding
I/O so that we can be sure that its all been written at the correct time.
Another result of this patch is that a number of ll_rw_block() calls
have become submit_bh() calls, closing some races at the same time.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:56:24 +01:00
Steven Whitehouse d7b616e252 [GFS2] Clean up ordered write code
The following patch removes the ordered write processing from
databuf_lo_before_commit() and moves it to log.c. This has the effect of
greatly simplyfying databuf_lo_before_commit() and well as potentially
making the ordered write code more efficient.

As a side effect of this, its now possible to remove ordered buffers
from the ordered buffer list at any time, so we now make use of this in
invalidatepage and releasepage to ensure timely release of these
buffers.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:56:03 +01:00
Abhijith Das d1e2777d4f [GFS2] panic after can't parse mount arguments
When you try to mount gfs2 with -o garbage, the mount fails and the gfs2
superblock is deallocated and becomes NULL. The vfs comes around later
on and calls gfs2_kill_sb. At this point the hidden gfs2 superblock
pointer (sb->s_fs_info) is NULL and dereferencing it through
gfs2_meta_syncfs causes the panic. (the other function call to
gfs2_delete_debugfs_file() succeeds because this function already checks
for a NULL pointer)

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:46 +01:00
Steven Whitehouse bb3b0e3df5 [GFS2] Clean up invalidatepage/releasepage
This patch fixes some bugs relating to journaled data files by cleaning
up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now
never block during gfs2_releasepage(), instead we always either release
or refuse to release depending on the status of the buffers.

This fixes Red Hat bugzillas #248969 and #252392.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2007-10-10 08:55:29 +01:00
Denis Cheng 34eaae398e [GFS2] fixed a NULL pointer assignment BUG
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:24 +01:00
Denis Cheng 5d35e31f43 [GFS2] better code for translating characters
the original code could work, but I think this code could work better.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:20 +01:00
Denis Cheng 2d3ba1ea97 [GFS2] unneeded typecast
sb->s_fs_info is a void pointer, thus the type cast is not needed.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:17 +01:00
Denis Cheng adb4ec13cd [GFS2] use list_for_each_entry instead
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:15 +01:00
Bob Peterson 75be73a824 [GFS2] Ensure journal file cache is flushed after recovery
This is for bugzilla bug #248176: GFS2: invalid metadata block

Patches 1 thru 3 were accepted upstream, but there were problems
with 4 and 5.  Those issues have been resolved and now the recovery
tests are passing without errors.  This code has gone through
41 * 3 successful gfs2 recovery tests before it hit an
unrelated (openais) problem.  I'm continuing to test it.

This is a complete rewrite of patch 5 for bug #248176, written by
Steve Whitehouse.  This is referred to in the bugzilla record as
"new 6" and "a different solution".

The problem was that the journal inodes, although protected by
a glock, were not synched with the other nodes because they don't
use the inode glock synch operations (i.e. no "glops" were defined).
Therefore, journal recovery on a journal-recovering node were causing
the blocks to get out of sync with the node that was actually trying
to use that journal as it comes back up from a reboot.

There are two possible solutions: (1) To make the journals use the
normal inode glock sync operations, or (2) To make the journal
operations take effect immediately (i.e. no caching).  Although
option 1 works, it turns out to be a lot more code.  Steve opted
for option 2, which is much simpler and therefore less prone to
regression errors.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

--
2007-10-10 08:55:12 +01:00
Steven Whitehouse 8fbbfd214c [GFS2] Reduce number of gfs2_scand processes to one
We only need a single gfs2_scand process rather than the one
per filesystem which we had previously. As a result the parameter
determining the frequency of gfs2_scand runs becomes a module
parameter rather than a mount parameter as it was before.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:08 +01:00
Denis Cheng ca5a939b33 [GFS2] use the declaration of gfs2_dops in the header file instead
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10 08:55:05 +01:00
Wendy Cheng bb9bcf0616 [GFS2] Obtaining no_formal_ino from directory entry
GFS2 lookup code doesn't ask for inode shared glock. This implies during
in-memory inode creation for existing file, GFS2 will not disk-read in
the inode contents. This leaves no_formal_ino un-initialized during
lookup time. The un-initialized no_formal_ino is subsequently encoded
into file handle. Clients will get ESTALE error whenever it tries to
access these files.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:24:08 +01:00
Robert Peterson b35997d448 [GFS2] Can't mount GFS2 file system on AoE device
This patch fixes bug 243131: Can't mount GFS2 file system on AoE device.
When using AoE devices with lock_nolock, there is no locking table, so
gfs2 (and gfs1) uses the superblock s_id.  This turns out to be the device
name in some cases.  In the case of AoE, the device contains a slash,
(e.g. "etherd/e1.1p2") which is an invalid character when we try to
register the table in sysfs.  This patch replaces the "/" with underscore.
Rather than add a new variable to the stack, I'm just reusing a (char *)
variable that's no longer used: table.

This code has been tested on the failing system using a RHEL5 patch.
The upstream code was tested by using gfs2_tool sb to interject a "/"
into the table name of a clustered gfs2 file system.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:23:24 +01:00
Steven Whitehouse 4bd91ba181 [GFS2] Add nanosecond timestamp feature
This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
to the way that the on-disk format works, older filesystems will just
appear to have this field set to zero. When mounted by an older version
of GFS2, the filesystem will simply ignore the extra fields so that
it will again appear to have whole second resolution, so that its
trivially backward compatible.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:23:12 +01:00
Steven Whitehouse bb8d8a6f54 [GFS2] Fix sign problem in quota/statfs and cleanup _host structures
This patch fixes some sign issues which were accidentally introduced
into the quota & statfs code during the endianess annotation process.
Also included is a general clean up which moves all of the _host
structures out of gfs2_ondisk.h (where they should not have been to
start with) and into the places where they are actually used (often only
one place). Also those _host structures which are not required any more
are removed entirely (which is the eventual plan for all of them).

The conversion routines from ondisk.c are also moved into the places
where they are actually used, which for almost every one, was just one
single place, so all those are now static functions. This also cleans up
the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.

The net result is a reduction of about 100 lines of code, many functions
now marked static plus the bug fixes as mentioned above. For good
measure I ran the code through sparse after making these changes to
check that there are no warnings generated.

This fixes Red Hat bz #239686

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:23:10 +01:00
Steven Whitehouse dbb7cae2a3 [GFS2] Clean up inode number handling
This patch cleans up the inode number handling code. The main difference
is that instead of looking up the inodes using a struct gfs2_inum_host
we now use just the no_addr member of this structure. The tests relating
to no_formal_ino can then be done by the calling code. This has
advantages in that we want to do different things in different code
paths if the no_formal_ino doesn't match. In the NFS patch we want to
return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
no_formal_ino doesn't match and thus we can withdraw in this case.

In order to later fix bz #201012, we need to be able to look up an inode
without knowing no_formal_ino, as the only information that is known to
us is the on-disk location of the inode in question.

This patch will also help us to fix bz #236099 at a later date by
cleaning up a lot of the code in that area.

There are no user visible changes as a result of this patch and there
are no changes to the on-disk format either.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09 08:22:24 +01:00
Robert Peterson 5f8820960c [GFS2] lockdump improvements
The patch below consists of the following changes (in code order):

1. I fixed a minor compiler warning regarding the printing of
   a kernel symbol address.
2. I implemented a suggestion from Dave Teigland that moves
   the debugfs information for gfs2 into a subdirectory so
   we can easily expand our use of debugfs in the future.
   The current code keeps the glock information in:
   /debug/gfs2/<fs>
   With the patch, the new code keeps the glock information in:
   /debug/gfs2/<fs>/glock
   That will allow us to create more debugfs files in the future.
3. This fixes a bug whereby a failed mount attempt causes the
   debugfs file to not be deleted.  Failed mount attempts should
   always clean up after themselves, including deleting the
   debugfs file and/or directory.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:11:33 +01:00
Robert Peterson 7c52b166c5 [GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540)
The attached patch resolves bz 228540.  This adds the capability
for gfs2 to dump gfs2 locks through the debugfs file system.
This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
gfs2 because all the ioctls were stripped out.  Please see the bugzilla
for more history about the fix.  This patch is also attached to the bugzilla
record.

The patch is against Steve Whitehouse's latest nmw git tree kernel
(2.6.21-rc1) and has been tested on system trin-10.

Signed-off-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-05-01 09:10:29 +01:00
Richard Fearn d5a6751b32 [GFS2] add newline to printk message
Patch for the 2.6.20 stable tree that adds a missing newline to one of
the printk messages in fs/gfs2/ops_fstype.c.

Signed-off-by: Richard Fearn <richardfearn@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:57:10 -05:00
David Chinner f73ca1b76c [PATCH] Revert bd_mount_mutex back to a semaphore
Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.

(XFS unlocks the semaphore from a different task, by design.  The mutex
code warns about this)

Signed-off-by: Dave Chinner <dgc@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-11 18:18:21 -08:00
Al Viro 629a21e7ec [GFS2] split and annotate gfs2_inum
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30 10:33:32 -05:00
Adrian Bunk bbbe451273 [GFS2] fs/gfs2/ops_fstype.c:fill_super_meta(): fix NULL dereference
Don't dereference new->s_root when we do know it's NULL.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:15:57 -04:00
Adrian Bunk b0cb66955f [GFS2] fs/gfs2/ops_fstype.c:gfs2_get_sb_meta(): remove unused variable
The Coverity checker spotted this unused variable.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-20 09:15:19 -04:00
Steven Whitehouse 3cf1e7bed4 [GFS2] Remove duplicate sb reading code
For some reason we had two different sets of code for reading in the
superblock. This removes one of them in favour of the other. Also we
don't need the temporary buffer for the sb since we already have one
in the gfs2 sb itself.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 11:49:41 -04:00
Steven Whitehouse 907b9bceb4 [GFS2/DLM] Fix trailing whitespace
As per Andrew Morton's request, removed trailing whitespace.

Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-25 09:26:04 -04:00
Fabio Massimo Di Nitto 7d308590ae [GFS2] Export lm_interface to kernel headers
lm_interface.h has a few out of the tree clients such as GFS1
and userland tools.

Right now, these clients keeps a copy of the file in their build tree
that can go out of sync.

Move lm_interface.h to include/linux, export it to userland and
clean up fs/gfs2 to use the new location.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-19 08:45:18 -04:00
Steven Whitehouse a2c4580797 [GFS2] vfree should be kfree
This was missed in an earlier patch when changing over from vmalloc
to kmalloc for the superblock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-08 10:13:03 -04:00
Steven Whitehouse b9201ce9a8 [GFS2] Forgot to remove unused include vmalloc.h
Excatly as the subject line says.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 14:46:39 -04:00
Steven Whitehouse 85d1da67f7 [GFS2] Move glock hash table out of superblock
There are several reasons why we want to do this:
 - Firstly its large and thus we'll scale better with multiple
   GFS2 fs mounted at the same time
 - Secondly its easier to scale its size as required (thats a plan
   for later patches)
 - Thirdly, we can use kzalloc rather than vmalloc when allocating
   the superblock (its now only 4888 bytes)
 - Fourth its all part of my plan to eventually be able to use RCU
   with the glock hash.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-07 14:40:21 -04:00
Steven Whitehouse 26c1a57412 [GFS2] More code style updates
As per Jan Engelhardt's fifth email. This has most of the changes
recommended, which is the removal of casts which are not required,
some indenting fixes and similar.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 15:32:10 -04:00
Steven Whitehouse a91ea69ffd [GFS2] Align all labels against LH side
This makes everything consistent.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-04 12:04:26 -04:00
Steven Whitehouse e9fc2aa091 [GFS2] Update copyright, tidy up incore.h
As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this
updates the copyright message to say "version" in full rather than
"v.2". Also incore.h has been updated to remove forward structure
declarations which are not required.

The gfs2_quota_lvb structure has now had endianess annotations added
to it. Also quota.c has been updated so that we now store the
lvb data locally in endian independant format to avoid needing
a structure in host endianess too. As a result the endianess
conversions are done as required at various points and thus the
conversion routines in lvb.[ch] are no longer required. I've
moved the one remaining constant in lvb.h thats used into lm.h
and removed the unused lvb.[ch].

I have not changed the HIF_ constants. That is left to a later patch
which I hope will unify the gh_flags and gh_iflags fields of the
struct gfs2_holder.

Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-09-01 11:05:15 -04:00
Abhijith Das 8638460540 [GFS2] Allow mounting of gfs2 and gfs2meta at the same time
This patch allows the simultaneous mounting of gfs2meta and gfs2
filesystems. A restriction however is that a gfs2meta fs may only be
mounted if its corresponding gfs2 filesystem is also mounted. Also, a
gfs2 filesystem cannot be unmounted before its gfs2meta filesystem.

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-25 17:19:55 -04:00
Steven Whitehouse a2242db090 [GFS2] Speed up scanning of glocks
I noticed the gfs2_scand seemed to be taking a lot of CPU,
so in order to cut that down a bit, here is a patch. Firstly
the type of a glock is a constant during its lifetime, so that
its possible to check this without needing locking. I've moved
the (common) case of testing for an inode glock outside of
the glmutex lock.

Also there was a mutex left over from when the glock cache was
master of the inode cache. That isn't required any more so I've
removed that too.

There is probably scope for further speed ups in the future
in this area.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-24 17:03:05 -04:00
Russell Cattelan 8872187780 [GFS2] Fix a couple of refcount leaks.
recovery.c add a brelse to deal with gfs2_replay_read_block being called
twice on the same block.

add a dput to drop the ref count on the root inode.
This was causing lingering glocks and thus causing
a mount failure to hang.

Fix a endian conversion macro that was was swizzling
16bits when it should have been swizzling 32.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-08-10 17:18:59 -04:00
Steven Whitehouse f45b7ddd2b [GFS2] Use a bio to read the superblock
This means that we don't need to create a special inode just to contain
a struct address_space in order to read a single disk block. Instead
we read the disk block directly. Its slightly faster, and uses slightly
less memory, but the real reason for doing this is that it removes a
special case from the glock code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-27 13:53:53 -04:00
Steven Whitehouse ecb1460dc4 [GFS2] Make GFS2 work with lock validator
Change our one existing old-style lock initialiser to a new-style
one. This allows the lock validator to work as intended.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-05 10:41:39 -04:00
Andrew Morton ccd6efd0cd [patch 1/1] gfs2: get_sb_dev() fix
Update GFS2 for dhowells API changes.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-07-03 11:23:09 -04:00
Steven Whitehouse feaa7bba02 [GFS2] Fix unlinked file handling
This patch fixes the way we have been dealing with unlinked,
but still open files. It removes all limits (other than memory
for inodes, as per every other filesystem) on numbers of these
which we can support on GFS2. It also means that (like other
fs) its the responsibility of the last process to close the file
to deallocate the storage, rather than the person who did the
unlinking. Note that with GFS2, those two events might take place
on different nodes.

Also there are a number of other changes:

 o We use the Linux inode subsystem as it was intended to be
used, wrt allocating GFS2 inodes
 o The Linux inode cache is now the point which we use for
local enforcement of only holding one copy of the inode in
core at once (previous to this we used the glock layer).
 o We no longer use the unlinked "special" file. We just ignore it
completely. This makes unlinking more efficient.
 o We now use the 4th block allocation state. The previously unused
state is used to track unlinked but still open inodes.
 o gfs2_inoded is no longer needed
 o Several fields are now no longer needed (and removed) from the in
core struct gfs2_inode
 o Several fields are no longer needed (and removed) from the in core
superblock

There are a number of future possible optimisations and clean ups
which have been made possible by this patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-06-14 15:32:57 -04:00
Steven Whitehouse 3a8a9a1034 [GFS2] Update copyright date to 2006
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 15:09:15 -04:00
Steven Whitehouse bd8968010a [GFS2] Remove semaphore.h from C files
We no longer use semaphores, everything has been converted to
mutex or rwsem, so we don't need to include this header any more.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-18 14:54:58 -04:00
Steven Whitehouse 9801f6461e [GFS2] Remove incorrect initialisation of gh_owner
The gh_owner field shouldn't be set or reset outside the glock code.
These were left over from when recursive locking was allowed. It
isn't any more, so they are not needed.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-12 14:06:02 -04:00
Robert S Peterson 5bb76af1e0 [GFS2] Set d_ops for root inode
Well, I managed to track down the bug in gfs2 that was causing
my grief.  Below is a patch for the problem.  Please incorporate
as you see fit.  Or should I say: as you see git.

The problem was basically that you never set d_ops for the root
inode, so the wrong hash algorithm was being used.  But only for
the root directory.  Turns out that if I used subdirectories, it
used the proper hash and my files were found just fine.

Signed-off-by: Robert S Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-05 16:29:50 -04:00
Steven Whitehouse 579b78a43b [GFS2] Remove GL_NEVER_RECURSE flag
There is no point in keeping this flag since recursion is not
now allowed for any glock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 14:58:26 -04:00
David Teigland c63e31c2cc [GFS2] journal recovery patch
This is one of the changes related to journal recovery I mentioned a
couple weeks ago.  We can get into a situation where there are only
readonly nodes currently mounting the fs, but there are journals that need
to be recovered.  Since the readonly nodes can't recover journals, the
next rw mounter needs to go through and check all journals and recover any
that are dirty (i.e. what the first node to mount the fs does).  This rw
mounter needs to skip the journals held by the existing readonly nodes.
Skipping those journals amounts to using the TRY flag on the journal locks
so acquiring the lock of a journal held by a readonly node will fail
instead of blocking indefinately.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-20 17:03:48 -04:00
Steven Whitehouse b09e593d79 [GFS2] Fix a ref count bug and other clean ups
This fixes a ref count bug that sometimes showed up a umount time
(causing it to hang) but it otherwise mostly harmless. At the same
time there are some clean ups including making the log operations
structures const, moving a memory allocation so that its not done
in the fast path of checking to see if there is an outstanding
transaction related to a particular glock.

Removes the sd_log_wrap varaible which was updated, but never actually
used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
(which it never needed anyway). Removes the "invalidate inodes" loop
from GFS2's put_super routine. This is done in kill super anyway so
we don't need to do it here. The loop was also bogus in that if there
are any inodes "stuck" at this point its a bug and we need to know
about it rather than hide it by hanging forever.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-07 11:17:32 -04:00
Steven Whitehouse 484adff8a0 [GFS2] Update locking in log.c
Replace the lock_for_trans()/lock_for_flush() functions with an rwsem.
In fact the sd_log_flush_lock becomes an rwsem (the write part of it)
and is extended slightly to cover everything that the lock_for_flush()
used to cover. The read part of the lock is instead of lock_for_trans().

This corrects the races in the original code and reduces the code size.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-29 09:12:12 -05:00
Steven Whitehouse 71b86f562b [GFS2] Further updates to dir and logging code
This reduces the size of the directory code by about 3k and gets
readdir() to use the functions which were introduced in the previous
directory code update.

Two memory allocations are merged into one. Eliminates zeroing of some
buffers which were never used before they were initialised by
other data.

There is still scope for further improvement in the directory code.

On the logging side, a hand created mutex has been replaced by a
standard Linux mutex in the log allocation code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-28 14:14:04 -05:00
Steven Whitehouse c752666c17 [GFS2] Fix bug in directory code and tidy up
Due to a typo, the dir leaf split operation was (for the first
split in a directory) writing the new hash vaules at the
wrong offset. This is now fixed.

Also some other tidy ups are included:

 - We use GFS2's hash function for dentries (see ops_dentry.c) so that
   we don't have to keep recalculating the hash values.
 - A lot of common code is eliminated between the various directory
   lookup routines.
 - Better error checking on directory lookup (previously different
   routines checked for different errors)
 - The leaf split operation has a couple of redundant operations
   removed from it, so it should be faster.

There is still further scope for further clean ups in the directory
code, and readdir in particular could do with slimming down a bit.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-20 12:30:04 -05:00
Steven Whitehouse 419c93e0b6 [GFS2] Add gfs2meta filesystem
In order to separate out the filesystem's metadata from "normal"
files and directories, a new filesystem type has been created.
It is called gfs2meta and mounting it gives access to the files
that were previously under .gfs2_admin (well still are until mkfs
is altered, which is next on the adgenda).

Its not currently possible to mount both gfs2 and gfs2meta on the
same block device at the same time. A future patch will allow that
to happen.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-02 16:33:41 -05:00
Steven Whitehouse c9fd43078f [GFS2] Tidy up mount code.
We no longer lookup ".gfs2_admin" in the root directory in order to
find it, but instead use the inode number given in the superblock.
Both the root directory and the admin directory are now looked up using
the same routine, so the redundant code is removed.

Also, there is no longer a reference to the root inode in the
GFS2 super block. When required this can be retreived via
sb->s_root->d_inode instead.

Assuming that we introduce a metadata filesystem type for GFS, then
this is a first step towards that goal.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-03-01 15:31:02 -05:00
Steven Whitehouse 5c676f6d35 [GFS2] Macros removal in gfs2.h
As suggested by Pekka Enberg <penberg@cs.helsinki.fi>.

The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h
The other macros are gone from gfs2.h as (although not requested
by Pekka Enberg) are a number of included header file which are now
included individually. The inode number comparison function is
now an inline function.

The DT2IF and IF2DT may be addressed in a future patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 17:23:27 -05:00
Steven Whitehouse 568f4c9659 [GFS2] 80 Column audit of GFS2
Requested by:
Prarit Bhargava <prarit@redhat.com>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 12:00:42 -05:00
Steven Whitehouse d92a8d4808 [GFS2] Audit printk and kmalloc
All printk calls now have KERN_ set where required and a couple of
kmalloc(), memset(.., 0, ...) calls changed to kzalloc().

This is in response to comments from:
Pekka Enberg <penberg@cs.helsinki.fi> and
Eric Sesterhenn <snakebyte@gmx.de>

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-27 10:57:14 -05:00
Steven Whitehouse f55ab26a8f [GFS2] Use mutices rather than semaphores
As well as a number of minor bug fixes, this patch changes GFS
to use mutices rather than semaphores. This results in better
information in case there are any locking problems.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-21 12:51:39 +00:00
Steven Whitehouse 7359a19cc7 [GFS2] Fix for root inode ref count bug
Umount is now working correctly again. The bug was due to
not getting an extra ref count when mounting the fs. We
should have bumped it by two (once for the internal pointer
to the root inode from the super block and once for the
inode hanging off the dcache entry for root).

Also this patch tidys up the code dealing with looking up
and creating inodes. We now pass Linux inodes (with gfs2_inodes
attached) rather than the other way around and this reduces code
duplication in various places.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-02-13 12:27:43 +00:00
Steven Whitehouse f42faf4fa4 [GFS2] Add gfs2_internal_read()
Add the new external read function. Its temporarily in jdata.c
even though the protoype is in ops_file.h - this will change
shortly. The current implementation will change to a page cache
one when that happens.

In order to effect the above changes, the various internal inodes
now have Linux inodes attached to them. We keep the references to
the Linux inodes, rather than the gfs2_inodes in the super block.

In order to get everything to work correctly I've had to reorder
the init sequence on mount (which I should probably have done
earlier when .gfs2_admin was made visible).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-30 18:34:10 +00:00
David Teigland b3b94faa5f [GFS2] The core of GFS2
This patch contains all the core files for GFS2.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-01-16 16:50:04 +00:00