diff --git a/Documentation/crc32.txt b/Documentation/crc32.txt
index a08a7dd9d625..8a6860f33b4e 100644
--- a/Documentation/crc32.txt
+++ b/Documentation/crc32.txt
@@ -1,4 +1,6 @@
-A brief CRC tutorial.
+=================================
+brief tutorial on CRC computation
+=================================
 
 A CRC is a long-division remainder.  You add the CRC to the message,
 and the whole thing (message+CRC) is a multiple of the given
@@ -8,7 +10,8 @@ remainder computed on the message+CRC is 0.  This latter approach
 is used by a lot of hardware implementations, and is why so many
 protocols put the end-of-frame flag after the CRC.
 
-It's actually the same long division you learned in school, except that
+It's actually the same long division you learned in school, except that:
+
 - We're working in binary, so the digits are only 0 and 1, and
 - When dividing polynomials, there are no carries.  Rather than add and
   subtract, we just xor.  Thus, we tend to get a bit sloppy about
@@ -40,11 +43,12 @@ throw the quotient bit away, but subtract the appropriate multiple of
 the polynomial from the remainder and we're back to where we started,
 ready to process the next bit.
 
-A big-endian CRC written this way would be coded like:
-for (i = 0; i < input_bits; i++) {
-	multiple = remainder & 0x80000000 ? CRCPOLY : 0;
-	remainder = (remainder << 1 | next_input_bit()) ^ multiple;
-}
+A big-endian CRC written this way would be coded like::
+
+	for (i = 0; i < input_bits; i++) {
+		multiple = remainder & 0x80000000 ? CRCPOLY : 0;
+		remainder = (remainder << 1 | next_input_bit()) ^ multiple;
+	}
 
 Notice how, to get at bit 32 of the shifted remainder, we look
 at bit 31 of the remainder *before* shifting it.
@@ -54,25 +58,26 @@ the remainder don't actually affect any decision-making until
 32 bits later.  Thus, the first 32 cycles of this are pretty boring.
 Also, to add the CRC to a message, we need a 32-bit-long hole for it at
 the end, so we have to add 32 extra cycles shifting in zeros at the
-end of every message,
+end of every message.
 
 These details lead to a standard trick: rearrange merging in the
 next_input_bit() until the moment it's needed.  Then the first 32 cycles
 can be precomputed, and merging in the final 32 zero bits to make room
-for the CRC can be skipped entirely.  This changes the code to:
+for the CRC can be skipped entirely.  This changes the code to::
 
-for (i = 0; i < input_bits; i++) {
-	remainder ^= next_input_bit() << 31;
-	multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
-	remainder = (remainder << 1) ^ multiple;
-}
+	for (i = 0; i < input_bits; i++) {
+		remainder ^= next_input_bit() << 31;
+		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+		remainder = (remainder << 1) ^ multiple;
+	}
 
-With this optimization, the little-endian code is particularly simple:
-for (i = 0; i < input_bits; i++) {
-	remainder ^= next_input_bit();
-	multiple = (remainder & 1) ? CRCPOLY : 0;
-	remainder = (remainder >> 1) ^ multiple;
-}
+With this optimization, the little-endian code is particularly simple::
+
+	for (i = 0; i < input_bits; i++) {
+		remainder ^= next_input_bit();
+		multiple = (remainder & 1) ? CRCPOLY : 0;
+		remainder = (remainder >> 1) ^ multiple;
+	}
 
 The most significant coefficient of the remainder polynomial is stored
 in the least significant bit of the binary "remainder" variable.
@@ -81,23 +86,25 @@ be bit-reversed) and next_input_bit().
 
 As long as next_input_bit is returning the bits in a sensible order, we don't
 *have* to wait until the last possible moment to merge in additional bits.
-We can do it 8 bits at a time rather than 1 bit at a time:
-for (i = 0; i < input_bytes; i++) {
-	remainder ^= next_input_byte() << 24;
-	for (j = 0; j < 8; j++) {
-		multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
-		remainder = (remainder << 1) ^ multiple;
-	}
-}
+We can do it 8 bits at a time rather than 1 bit at a time::
 
-Or in little-endian:
-for (i = 0; i < input_bytes; i++) {
-	remainder ^= next_input_byte();
-	for (j = 0; j < 8; j++) {
-		multiple = (remainder & 1) ? CRCPOLY : 0;
-		remainder = (remainder >> 1) ^ multiple;
+	for (i = 0; i < input_bytes; i++) {
+		remainder ^= next_input_byte() << 24;
+		for (j = 0; j < 8; j++) {
+			multiple = (remainder & 0x80000000) ? CRCPOLY : 0;
+			remainder = (remainder << 1) ^ multiple;
+		}
+	}
+
+Or in little-endian::
+
+	for (i = 0; i < input_bytes; i++) {
+		remainder ^= next_input_byte();
+		for (j = 0; j < 8; j++) {
+			multiple = (remainder & 1) ? CRCPOLY : 0;
+			remainder = (remainder >> 1) ^ multiple;
+		}
 	}
-}
 
 If the input is a multiple of 32 bits, you can even XOR in a 32-bit
 word at a time and increase the inner loop count to 32.