Friday, September 21, 2007

Bite your thumb at text entry

Preamble - T9, Stylus, Thumb keyboards
There are many ways to enter text on a mobile phone. The best are T9 and its variants, by virtue of using the phone's existing set of 9 keys and the letters on them, and intelligently guessing the word you want as you type it.

Entering text with a stylus also has some contenders: MessagEase, Speedscript, Touch (better for long words, e.g. German), a tiny onscreen QWERTY, Ring Writer, and the highly impressive HexInput/QUONG and Shark/ATOMIK siblings. Most of these rely on the ability of a stylus-wielding user to make accurate strokes and shapes on the screen.

One arena suffers, though. Thumbs. Nokia's N800 (see 6:26) and the iPhone both offer a larger onscreen keyboard to mash with thumbs. Apple gets it fairly close with the iPhone keyboard, with the improvements given in the video linked above. Particularly impressive is the ouzza -> pizza recognition. It's this that I'm targeting with my thought.

Idea - Fuzzy matching
Suppose that instead of delineated zones on an input device we define a fuzzier 'hotspot' for each key. Then instead of trying to guess the word from the sequence of keys the user hit (ouzza), we look at the fuzzier positions that they hit (p-o, i-u, z, z, a). Ultimately, we take the coordinates of each tap and look at a thumb-sized radius around it. This should give us a weighted set of letters for each tap:
  • 40% P, 50% O, 10% L
  • 30% I, 70% U
  • 90% Z, 10% X
  • 90% Z, 10% X
  • 70% A, 20% Z, 10% S
This turns into a modified game of "My first is in... " to guess what the user is trying to type. There are other statistical games we can play as well. The user may systematically hit one side of P rather than the other, so we can drift the letter centre slightly (and show that to the user if we can leave the form-over-function land of Apple). This should produce a bias map across the keyboard, with some parts stretched and others compressed.
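To make the weighted-letter idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the unit-grid key layout, the Gaussian falloff used to stand in for the thumb radius, and the names KEY_CENTRES, letter_weights, thumb_radius and cutoff. It will not reproduce the exact percentages above, only the general shape of the result.

```python
import math

# Hypothetical layout: key centres on a unit grid, rows staggered like QWERTY.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_CENTRES = {
    letter: (col + offset, float(row))
    for row, (letters, offset) in enumerate(ROWS)
    for col, letter in enumerate(letters)
}


def letter_weights(tap, thumb_radius=0.6, cutoff=0.01):
    """Return {letter: weight} for a tap at (x, y); the weights sum to 1.

    A Gaussian falloff is an assumed stand-in for "a thumb radius around
    the tap"; any function that decreases with distance would do.
    """
    x, y = tap
    raw = {}
    for letter, (kx, ky) in KEY_CENTRES.items():
        d2 = (x - kx) ** 2 + (y - ky) ** 2
        raw[letter] = math.exp(-d2 / (2 * thumb_radius ** 2))
    total = sum(raw.values())
    # Drop letters with a negligible share, then renormalise what is left.
    kept = {k: v / total for k, v in raw.items() if v / total >= cutoff}
    norm = sum(kept.values())
    return {k: v / norm for k, v in kept.items()}


if __name__ == "__main__":
    # A tap that lands between O and P, slightly towards the row below.
    weights = letter_weights((8.5, 0.2))
    for letter, w in sorted(weights.items(), key=lambda p: -p[1]):
        print(f"{letter.upper()}: {w:.0%}")
```

A tap exactly between two keys gives them equal weight, and the neighbours on the row below pick up a smaller share, which is the kind of spread the list above describes.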

This would naturally yield a two-stage process - first undo the bias to get the de-biased coordinates, then try to match the vector path that the user has pressed with their thumbs to the words in the dictionary. This should yield a list of words, each with a confidence rating. Later incarnations could look at adjacent words, in the way that speech recognition does, to further improve the guess confidence. Then select the word with the highest confidence, or perhaps show the list of possible words to the user.
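The two stages might be sketched like this, again in Python and again under loud assumptions: the key layout, the tiny word list, the Gaussian likelihood, the running-average bias update and all function names (debias, rank_words, update_bias) are hypothetical. Only candidate words of the same length as the tap sequence are considered, and the de-biasing step naively uses the key nearest the raw tap, which a real implementation would want to improve on.

```python
import math

# Hypothetical layout and word list, for illustration only.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_CENTRES = {
    letter: (col + offset, float(row))
    for row, (letters, offset) in enumerate(ROWS)
    for col, letter in enumerate(letters)
}
DICTIONARY = ["pizza", "plaza", "fuzzy", "dizzy", "piano", "puzzle"]


def nearest_key(point):
    return min(KEY_CENTRES, key=lambda k: math.dist(point, KEY_CENTRES[k]))


def update_bias(bias, intended_letter, tap, rate=0.1):
    """Drift the stored offset for a key towards where the user really hit."""
    kx, ky = KEY_CENTRES[intended_letter]
    ox, oy = bias.get(intended_letter, (0.0, 0.0))
    bias[intended_letter] = (
        ox + rate * (tap[0] - kx - ox),
        oy + rate * (tap[1] - ky - oy),
    )


def debias(tap, bias):
    """Stage 1: subtract the learned offset of the key nearest the tap."""
    dx, dy = bias.get(nearest_key(tap), (0.0, 0.0))
    return (tap[0] - dx, tap[1] - dy)


def letter_likelihood(tap, letter, thumb_radius=0.6):
    d2 = math.dist(tap, KEY_CENTRES[letter]) ** 2
    return math.exp(-d2 / (2 * thumb_radius ** 2))


def rank_words(taps, bias=None, words=DICTIONARY):
    """Stage 2: score same-length words against the de-biased tap path.

    Returns [(word, confidence), ...], best first; confidences sum to 1.
    """
    clean = [debias(t, bias or {}) for t in taps]
    scores = {
        word: math.prod(letter_likelihood(t, c) for t, c in zip(clean, word))
        for word in words
        if len(word) == len(clean)
    }
    total = sum(scores.values())
    if total == 0:
        return []
    ranked = ((w, s / total) for w, s in scores.items())
    return sorted(ranked, key=lambda p: p[1], reverse=True)


if __name__ == "__main__":
    # Taps that a strict key grid would read as "ouzza".
    taps = [(8.4, 0.1), (6.6, 0.0), (0.8, 2.0), (0.8, 2.0), (0.3, 1.0)]
    for word, confidence in rank_words(taps)[:3]:
        print(word, f"{confidence:.3f}")
```

Multiplying per-letter likelihoods treats each tap as independent, which is the simplest choice; looking at adjacent words, as suggested above, would mean multiplying in a further word-sequence probability.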

Sum
In the end, it should be a keyboard that you type on, which adapts to you, and to the fact that you're a human - you miss sometimes, you are usually trying to type words, and at speed you need some latitude and some help. So do error recognition and correction, and look at the actual geography of the keyboard to try to correct errors, or understand what is being typed. Let's use our brains, perhaps?

1 comment:

Phil H said...

http://arstechnica.com/journals/linux.ars/2007/09/10/nokia-opens-the-hildon-input-method-framework