remarkable-linux/drivers/char
Egmont Koblinger 2f1a2ccb9c console UTF-8 fixes
The UTF-8 part of the vt driver suffers from the following issues which are
addressed in my patch:

1) If there's no glyph found for a particular valid UTF-8 character, we try
   to display U+FFFD. However if this one is not found either, here's what
   the current kernel does:

   - First, if the Unicode value is less than the number of glyphs, use the
     glyph directly from that position of the glyph table. While this may be
     a good idea in the 8-bit world, it makes absolutely no sense with
     Unicode in mind. For example, if a Latin-2 font is loaded and an
     application prints U+00FB ("u with circumflex", not present in Latin-2)
     then as a fallback solution the glyph from the 0xFB position of the
     Latin-2 fontset (which is a "u with double acute" - a different
     character) is displayed.

   - Second, if this fallback fails too, a simple ASCII question mark is
     printed, which is visually indistinguishable from a real question mark.

   I changed the code to skip the first step (except if in non-UTF-8 mode),
   and changed the second step to print the question mark with inverse color
   attributes, so it is visually clear that it's not a real question mark,
   and it more closely resembles the common glyph of U+FFFD.
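
   As a rough illustration of the fallback order described in point 1 (not
   the code from the patch), the selection could be sketched as below. The
   names choose_glyph, glyph_choice, lookup_fn and ascii_only_lookup are
   hypothetical, and the sketch assumes the font keeps '?' at its ASCII
   position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: which font slot to draw, and whether to draw
 * it with inverted colors. */
struct glyph_choice {
    int  slot;
    bool inverse;
};

/* Hypothetical lookup callback: returns the font slot for a code point,
 * or -1 if the font has no glyph for it. */
typedef int (*lookup_fn)(uint32_t ucs);

static struct glyph_choice choose_glyph(uint32_t ucs, bool utf8_mode,
                                        uint32_t font_glyphs, lookup_fn lookup)
{
    struct glyph_choice g = { -1, false };

    assert(lookup != NULL);

    g.slot = lookup(ucs);
    if (g.slot < 0)
        g.slot = lookup(0xFFFD);        /* try the replacement character */
    if (g.slot < 0 && !utf8_mode && ucs < font_glyphs)
        g.slot = (int)ucs;              /* direct indexing: 8-bit mode only */
    if (g.slot < 0) {
        g.slot = '?';                   /* assumed to sit at its ASCII slot */
        g.inverse = true;               /* marks it as "not a real '?'" */
    }
    return g;
}

/* Toy font for illustration: ASCII only, no glyph for U+FFFD. */
static int ascii_only_lookup(uint32_t ucs)
{
    return ucs < 0x80 ? (int)ucs : -1;
}
```

   With the toy font, U+00FB in UTF-8 mode yields an inverse '?', while the
   same value in non-UTF-8 mode still indexes the font directly.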

2) The UTF-8 decoder is buggy in many ways:

   - Lone continuation bytes (section 3.1 of Markus Kuhn's UTF-8 stress
     test) are not caught; they are displayed as some "random" (taken
     directly from the font table, see above) glyphs instead of the
     replacement character.

   - Incomplete sequences (sections 3.2 and 3.3 of the stress test) emit no
     replacement character, but rather cause the subsequent valid character
     to be displayed multiple times(!).

   - The decoder is not safe: overlong sequences are currently not caught;
     they are displayed as if they were valid representations. This may
     even have security implications.

   - The decoder does not handle D800..DFFF and FFFE..FFFF specially; it
     just emits these code points and lets them be looked up in the glyph
     table. Since these are invalid code points, I replace them with U+FFFD
     and hence give no chance for them to be looked up in the glyph table.
     (Assuming no font ships glyphs for these code points, this change is
     not visible to the users since the glyph shown will be the same.)

   With my fixes to the decoder it now behaves exactly as Markus Kuhn's
   stress test recommends.
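
   The checks listed in point 2 can be illustrated with a small streaming
   decoder. This is a sketch under stated assumptions, not the decoder from
   the patch: the names utf8_state, utf8_sanitize and utf8_feed are
   hypothetical, and as a simplification it accepts sequences of at most
   four bytes and treats bytes 0xF8..0xFF as invalid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_REPLACEMENT 0xFFFDu

struct utf8_state {
    uint32_t cp;        /* code point accumulated so far */
    uint32_t min;       /* smallest value allowed for this sequence length */
    int      remaining; /* continuation bytes still expected */
};

/* Map code points that must never reach the glyph lookup to U+FFFD. */
static uint32_t utf8_sanitize(uint32_t cp, uint32_t min)
{
    if (cp < min)                       /* overlong encoding */
        return UTF8_REPLACEMENT;
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates */
        return UTF8_REPLACEMENT;
    if (cp == 0xFFFE || cp == 0xFFFF)   /* invalid code points */
        return UTF8_REPLACEMENT;
    if (cp > 0x10FFFF)                  /* beyond the Unicode range */
        return UTF8_REPLACEMENT;
    return cp;
}

/* Feed one byte. Writes 0, 1 or 2 code points to out[] and returns how
 * many were written. */
static int utf8_feed(struct utf8_state *s, uint8_t b, uint32_t out[2])
{
    int n = 0;

    assert(s != NULL && out != NULL);

    if ((b & 0xC0) == 0x80) {           /* continuation byte */
        if (s->remaining == 0) {
            out[n++] = UTF8_REPLACEMENT;    /* lone continuation byte */
            return n;
        }
        s->cp = (s->cp << 6) | (b & 0x3F);
        if (--s->remaining == 0)
            out[n++] = utf8_sanitize(s->cp, s->min);
        return n;
    }

    if (s->remaining > 0) {             /* previous sequence was cut short */
        out[n++] = UTF8_REPLACEMENT;
        s->remaining = 0;
    }

    if (b < 0x80) {
        out[n++] = b;                   /* plain ASCII */
    } else if ((b & 0xE0) == 0xC0) {
        s->cp = b & 0x1F; s->min = 0x80;    s->remaining = 1;
    } else if ((b & 0xF0) == 0xE0) {
        s->cp = b & 0x0F; s->min = 0x800;   s->remaining = 2;
    } else if ((b & 0xF8) == 0xF0) {
        s->cp = b & 0x07; s->min = 0x10000; s->remaining = 3;
    } else {
        out[n++] = UTF8_REPLACEMENT;    /* 0xF8..0xFF: treated as invalid */
    }
    return n;
}
```

   Feeding the bytes one at a time lets an interrupted sequence produce its
   own replacement character before the interrupting byte is processed, so
   the following valid character is emitted exactly once.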

3) It has no concept of double-width (CJK) characters. It's way beyond the
   scope of my patch to try to display them, but at least I think it's
   important for the cursor to jump two positions when printing such
   characters, since this is what applications (such as text editors)
   expect. Currently the cursor only jumps one position, and hence
   applications suffer from displaying and refreshing problems, and editing
   some English letters that are preceded by some CJK characters in the same
   line is a nightmare. With my patch an additional space is inserted after
   the CJK character has been printed (which usually means a replacement
   symbol of course). (If U+FFFD isn't available and hence an inverse
   question mark is displayed in the first cell, I keep the inverted state
   for the space in the 2nd column so it's quite easy to see that they are
   tied together.)

4) There is a small built-in table of zero-width spaces that are not to be
   printed but silently skipped. U+200A is included there, but it's not a
   zero-width character, so I remove it from there.
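
Points 3 and 4 both come down to how many cells a code point occupies, which
can be sketched as one function returning 0, 1 or 2. This is an illustration
only: cell_width, in_table and wide_ranges are hypothetical names, the wide
ranges are an approximation in the style of Markus Kuhn's wcwidth() rather
than the table from the patch, and the zero-width set is an assumed example
(the only detail taken from the text above is that U+200A is not in it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct interval {
    uint32_t first, last;
};

/* Illustrative double-width ranges, sorted by first code point. */
static const struct interval wide_ranges[] = {
    { 0x1100, 0x115F }, { 0x2329, 0x232A }, { 0x2E80, 0x303E },
    { 0x3040, 0xA4CF }, { 0xAC00, 0xD7A3 }, { 0xF900, 0xFAFF },
    { 0xFE30, 0xFE6F }, { 0xFF00, 0xFF60 }, { 0xFFE0, 0xFFE6 },
    { 0x20000, 0x2FFFD }, { 0x30000, 0x3FFFD },
};

/* Binary search over a sorted, non-overlapping interval table. */
static int in_table(uint32_t c, const struct interval *t, size_t n)
{
    size_t lo = 0, hi = n;

    assert(t != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (c > t[mid].last)
            lo = mid + 1;
        else if (c < t[mid].first)
            hi = mid;
        else
            return 1;
    }
    return 0;
}

/* Number of cells the cursor should advance for code point c. */
static int cell_width(uint32_t c)
{
    /* Zero-width characters are skipped; U+200A (hair space) is not
     * one of them and therefore takes a normal cell. */
    if (c == 0xFEFF || (c >= 0x200B && c <= 0x200F))
        return 0;
    if (in_table(c, wide_ranges, sizeof(wide_ranges) / sizeof(wide_ranges[0])))
        return 2;
    return 1;
}
```

A caller would print the glyph (or its replacement) in the first cell and,
when cell_width() returns 2, fill the second cell with a space carrying the
same attributes.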

Signed-off-by: Egmont Koblinger <egmont@uhulinux.hu>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:12 -07:00
agp Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6 2007-05-05 14:55:20 -07:00
drm Merge branch 'drm-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied/drm-2.6 2007-05-07 12:24:07 -07:00
hw_random Use stop_machine_run in the Intel RNG driver 2007-05-08 11:15:00 -07:00
ip2 Fix bogus 'inline' in drivers/char/ip2/i2lib.c 2007-02-21 11:18:26 -08:00
ipmi move die notifier handling to common code 2007-05-08 11:15:04 -07:00
mwave [PATCH] mwave: interesting flags savings 2007-02-20 17:10:14 -08:00
pcmcia PCI: Cleanup the includes of <linux/pci.h> 2007-05-02 19:02:35 -07:00
rio rio: typo in bitwise AND expression. 2007-02-17 18:57:09 +01:00
tpm tpm_infineon: add support for devices in mmio space 2007-05-08 11:15:02 -07:00
watchdog header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
.gitignore [MIPS] Remove IT8172-based platforms, ITE 8172G and Globespan IVR support. 2006-10-03 17:59:17 +01:00
amiserial.c [PATCH] CHAR-Amiserial: turn local_save_flags() + local_irq_disable() into local_irq_save() 2007-02-11 11:18:07 -08:00
apm-emulation.c [APM] Add shared version of APM emulation 2007-02-09 17:08:57 +00:00
applicom.c IRQ: Maintain regs pointer globally rather than passing to IRQ handlers 2006-10-05 15:10:12 +01:00
applicom.h
briq_panel.c Revert "[POWERPC] Rename get_property to of_get_property: drivers" 2007-04-26 22:24:31 +10:00
cd1865.h
ChangeLog
consolemap.c console UTF-8 fixes 2007-05-08 11:15:12 -07:00
cp437.uni
cs5535_gpio.c Char: cs5535_gpio, add MODULE_DEVICE_TABLE 2007-05-08 11:15:04 -07:00
cyclades.c cyclades: remove custom types 2007-05-08 11:15:03 -07:00
decserial.c [PATCH] dz: Fixes to make it work 2006-12-07 08:39:41 -08:00
defkeymap.c_shipped
defkeymap.map
digi1.h [PATCH] Clean up the old digi support and rescue it 2005-09-07 16:57:20 -07:00
digiFep1.h [PATCH] Clean up the old digi support and rescue it 2005-09-07 16:57:20 -07:00
digiPCI.h
ds1286.c [CHAR] ds1286: Fix handling of seconds in RTC_ALM_SET ioctl. 2007-03-08 01:10:30 +00:00
ds1302.c [PATCH] DS1302: local_irq_disable() is redundant after local_irq_save() 2007-02-12 09:48:30 -08:00
ds1620.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
dsp56k.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
dtlk.c dtlk: fix error checks in module_init() 2007-05-08 11:15:09 -07:00
ec3104_keyb.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
efirtc.c [PATCH] make more file_operation structs static 2006-07-03 15:26:59 -07:00
epca.c [PATCH] char/epca.c: remove unused function 2007-03-05 07:57:53 -08:00
epca.h [PATCH] char: kill unneeded memsets 2006-10-04 07:55:13 -07:00
epcaconfig.h
esp.c [PATCH] tty: switch to ktermios 2006-12-08 08:28:57 -08:00
generic_nvram.c [PATCH] mark struct file_operations const 3 2007-02-12 09:48:45 -08:00
generic_serial.c [PATCH] generic_serial: fix decoding of baud rate 2007-03-27 09:05:15 -07:00
genrtc.c WorkStruct: make allyesconfig 2006-11-22 14:57:56 +00:00
hangcheck-timer.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
hpet.c [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
hvc_beat.c [POWERPC] Celleb: hypervisor console driver 2007-02-07 14:03:21 +11:00
hvc_console.c drivers/char/hvc_console.c: cleanups 2007-05-08 11:14:59 -07:00
hvc_console.h [POWERPC] Make the hvc_console output buffer size settable 2006-07-13 18:53:32 +10:00
hvc_iseries.c [POWERPC] Rename get_property to of_get_property: partial drivers 2007-04-27 15:51:56 +10:00
hvc_rtas.c [POWERPC] Make the hvc_console output buffer size settable 2006-07-13 18:53:32 +10:00
hvc_vio.c [POWERPC] Rename get_property to of_get_property: partial drivers 2007-04-27 15:51:56 +10:00
hvcs.c [PATCH] tty: switch to ktermios 2006-12-08 08:28:57 -08:00
hvsi.c [POWERPC] Rename get_property to of_get_property: partial drivers 2007-04-27 15:51:56 +10:00
i8k.c i386: sched.h inclusion from module.h is baack 2007-05-08 11:15:08 -07:00
ip27-rtc.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
isicom.c [PATCH] Char: tty_wakeup cleanup 2007-02-11 10:51:26 -08:00
istallion.c [PATCH] Char: tty_wakeup cleanup 2007-02-11 10:51:26 -08:00
Kconfig Make /dev/port conditional on config symbol 2007-05-08 11:15:07 -07:00
keyboard.c SPIN_LOCK_UNLOCKED cleanup in drivers/char/keyboard 2007-05-08 11:15:11 -07:00
lcd.c [CHAR] lcd: Fix two warnings. 2007-03-17 01:03:26 +00:00
lcd.h [MIPS] Add MTD device support for Cobalt 2007-02-20 17:11:55 +00:00
lp.c ROUND_UP macro cleanup in drivers/char/lp.c 2007-05-08 11:15:08 -07:00
Makefile rename TANBAC TB0219 config 2007-05-07 12:13:04 -07:00
mbcs.c [PATCH] mark struct file_operations const 3 2007-02-12 09:48:45 -08:00
mbcs.h
mem.c Make /dev/port conditional on config symbol 2007-05-08 11:15:07 -07:00
misc.c [PATCH] Correct misc_register return code handling in several drivers 2006-12-07 08:39:35 -08:00
mmtimer.c [PATCH] Correct misc_register return code handling in several drivers 2006-12-07 08:39:35 -08:00
moxa.c [PATCH] Char: moxa, pci probing 2007-02-11 10:51:30 -08:00
mspec.c [PATCH] mark struct file_operations const 3 2007-02-12 09:48:45 -08:00
mxser.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
mxser.h [PATCH] mxser: remove ambiguous redefinition of INIT_WORK 2007-02-11 10:51:25 -08:00
mxser_new.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
mxser_new.h [PATCH] Char: mxser_new, upgrade to 1.9.15 2007-02-11 10:51:29 -08:00
n_hdlc.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
n_r3964.c [PATCH] Char: timers cleanup 2007-02-12 09:48:30 -08:00
n_tty.c [PATCH] tty: update the tty layer to work with struct pid 2007-02-12 09:48:32 -08:00
nsc_gpio.c [PATCH] struct path: convert char-drivers 2006-12-08 08:28:44 -08:00
nvram.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
nwbutton.c [PATCH] Char: timers cleanup 2007-02-12 09:48:30 -08:00
nwbutton.h IRQ: Maintain regs pointer globally rather than passing to IRQ handlers 2006-10-05 15:10:12 +01:00
nwflash.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
pc8736x_gpio.c [PATCH] drivers/char/pc8736x_gpio.c: remove unused static functions 2006-09-29 09:18:05 -07:00
ppdev.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
pty.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
random.c [NET]: random functions can use nsec resolution instead of usec 2007-04-25 22:28:25 -07:00
raw.c [PATCH] raw: don't allow the creation of a raw device with minor number 0 2007-02-11 10:51:34 -08:00
riscom8.c [PATCH] Char: tty_wakeup cleanup 2007-02-11 10:51:26 -08:00
riscom8.h
riscom8_reg.h
rocket.c Char: rocket, add MODULE_DEVICE_TABLE 2007-05-08 11:15:04 -07:00
rocket.h
rocket_int.h [PATCH] drivers/char/rocket.c: cleanups 2005-06-25 16:25:04 -07:00
rtc.c [PATCH] sysctl: remove insert_at_head from register_sysctl 2007-02-14 08:09:59 -08:00
scc.h [PATCH] m68k: static vs. extern in scc.h 2006-01-12 09:09:00 -08:00
scx200_gpio.c [PATCH] scx200_gpio export cleanups 2006-09-29 09:18:06 -07:00
selection.c [PATCH] tty locking on resize 2006-09-29 09:18:12 -07:00
ser_a2232.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
ser_a2232.h
ser_a2232fw.ax
ser_a2232fw.h
serial167.c [PATCH] Char: serial167, cleanup 2007-02-11 10:51:28 -08:00
snsc.c IRQ: Maintain regs pointer globally rather than passing to IRQ handlers 2006-10-05 15:10:12 +01:00
snsc.h [IA64-SGI] Handle SC env. powerdown events 2006-01-26 13:32:26 -08:00
snsc_event.c IRQ: Maintain regs pointer globally rather than passing to IRQ handlers 2006-10-05 15:10:12 +01:00
sonypi.c sonypi: use mutex instead of semaphore 2007-04-28 22:13:34 -04:00
specialix.c [PATCH] Char: timers cleanup 2007-02-12 09:48:30 -08:00
specialix_io8.h
stallion.c [PATCH] Char: stallion, use dynamic dev 2006-12-08 08:28:59 -08:00
sx.c [PATCH] sx: fix non-PCI build 2006-12-13 09:05:49 -08:00
sx.h [PATCH] Char: sx, request regions 2006-12-08 08:28:59 -08:00
sxboards.h
sxwindow.h
synclink.c drivers/char/synclink.c: check kmalloc() return value 2007-05-08 11:15:02 -07:00
synclink_gt.c [PATCH] Char: timers cleanup 2007-02-12 09:48:30 -08:00
synclinkmp.c [PATCH] Char: timers cleanup 2007-02-12 09:48:30 -08:00
sysrq.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
tb0219.c [PATCH] struct path: convert char-drivers 2006-12-08 08:28:44 -08:00
tipar.c layered parport code uses parport->dev 2007-05-08 11:15:05 -07:00
tlclk.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
toshiba.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
tty_io.c Protect tty drivers list with tty_mutex 2007-05-08 11:15:05 -07:00
tty_ioctl.c [PATCH] tty: improve encode_baud_rate logic 2007-02-11 10:51:32 -08:00
vc_screen.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
viocons.c [POWERPC] iSeries: fix viocons init 2006-12-20 16:37:48 +11:00
viotape.c [PATCH] mark struct file_operations const 3 2007-02-12 09:48:45 -08:00
vme_scc.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
vr41xx_giu.c [MIPS] Vr41xx: Fix after GENERIC_HARDIRQS_NO__DO_IRQ change 2007-01-23 18:26:47 +00:00
vt.c console UTF-8 fixes 2007-05-08 11:15:12 -07:00
vt_ioctl.c [PATCH] vt: fix potential race in VT_WAITACTIVE handler 2007-04-02 10:06:09 -07:00