Thursday, September 24, 2009

Speech recognition for language learning

Last week I took a two-day mini course on how to use the Japanese open source spoken language recognition system Julius.

It can do both online and offline speech recognition, even over a network, out of the box. For example, if I say "かわいいね!" into the microphone, the words are printed to the screen. Neat, huh?

However, compatible acoustic models (basically, a mapping of wave signals to linguistic phonemes) only exist for Japanese, meaning I can't do any English speech recognition with it.

So, what could I, a Japanese language beginner, possibly do with such a system?

One idea is to jump on the iPhone development bandwagon and use it for language learning. Here was the scenario from this evening:

Kenzo heads to the 100 yen store to buy some dish detergent. He goes to the register to pay, placing his 105 yen on the counter. The clerk asks him something in Japanese. Kenzo flails around, says "euhhh... eto....", trying to convey "Je ne comprends pas, je ne parle pas japonais!"

Finally the clerk gives up and hands him his dish detergent and receipt. Kenzo walks out with his merchandise, confused. Repeat every time he buys something from the store.

So, what did the clerk say? After a number of awkward experiences, Kenzo figures he's being asked whether he needs a plastic bag or not. But sometimes he says "いいえ" (No), and the guy gives him a bag anyway! What's going on?

If only his iPhone could transcribe what the clerk was saying, so Kenzo could parse out the words in written form. It's always a lot easier written down, right?

Well, here's what the clerk said:

"シールでよろしいですか?" - shiiru de yoroshii desu ka

Meaning "Is a seal ok?", referring to the little sticker they put on items in lieu of a plastic bag.

... Man, why'd I buy this Japanese cell phone with TV when I could've bought an iPhone. Bah.

Tuesday, September 22, 2009

Installing OpenCV on Mac OS X, with Python Bindings

It's my first time developing on OS X, so I thought I'd document the process to get started using vision algorithms on a Mac.

Platform: OS X Version 10.5.8

1. Install Xcode from the Mac OS X install CD.

2. Download and install MacPorts, a command-line tool to install software and deal with dependencies for you. (For some reason I received a "Could not find specified message for index 16" error when trying to install the version for Snow Leopard, so I installed the version for Leopard instead.)

Don't forget to run sudo port selfupdate once it's installed.

3. Install OpenCV by running the command sudo port install opencv. This installs OpenCV 1.0, which is fine for me, because that's the version the Python bindings (next step) are tested on.

4. Download ctypes-opencv source and demo files. Once that's done, navigate to the ctypes-opencv directory and install using sudo python setup.py install

5. Open your ~/.profile (created when installing MacPorts) and add the following line:
export DYLD_FALLBACK_LIBRARY_PATH=/opt/local/lib

Save and close it, then restart your terminal window for the .profile to be loaded.

6. Try one of the ctypes-opencv demos to check if it works: python houghlines.py
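
If a demo fails to start, a quick thing to check is whether the OpenCV libraries are actually visible in the directory from step 5. Here's a rough sketch of such a check. The helper name find_opencv_libs is made up for this post, and the "libcv" file name prefix is my assumption about what the MacPorts OpenCV 1.0 build installs, so adjust it if your library files are named differently.

```python
import os


def find_opencv_libs(search_path=None, prefix="libcv"):
    """List .dylib files starting with `prefix` in each directory of a
    colon-separated search path.

    `find_opencv_libs` is a hypothetical helper, not part of OpenCV or
    ctypes-opencv. The default prefix "libcv" is an assumption about how
    the OpenCV 1.0 libraries are named.
    """
    if search_path is None:
        # Fall back to the variable exported in ~/.profile (step 5).
        search_path = os.environ.get("DYLD_FALLBACK_LIBRARY_PATH", "")

    found = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith(prefix) and name.endswith(".dylib"):
                found.append(os.path.join(directory, name))
    return found


if __name__ == "__main__":
    libs = find_opencv_libs()
    if libs:
        for lib in libs:
            print("Found: " + lib)
    else:
        print("No matching libraries found; check DYLD_FALLBACK_LIBRARY_PATH.")
```

Save it under any file name you like and run it with python from a freshly opened terminal. If nothing is found, the .profile line was probably not loaded, or the port installed its libraries somewhere other than /opt/local/lib.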

Running the camshift demo, using the integrated MacBook iSight webcam:



That's me thinking "Oh, OpenCV installation. Can start my research now?"