Arabic Chat Translator Could Transform Social Media Analysis
Basis Technology Corp.’s Rosette Chat Translator converts the Arabic chat alphabet, which uses Latin characters and numerals to represent the language, into standard written form. CEO Carl Hoffman said the software is designed specifically for intelligence agencies and commercial enterprises.
Nearly 500 million people speak Arabic, the official language of more than 20 countries. Romanized Arabic chat, or Arabizi, is often used online in social media, blogs and chat rooms, as well as in cell phone text messages. To maintain secrecy, terrorists rely on these forms of communication to recruit, raise funds and plan attacks, experts say. But Arabizi has so many variations that it can cause headaches for analysts looking for clues. Words are spelled phonetically, and the spelling can vary with the author's dialect. The phrase “Tell them” could be written “2ulluhom” by an Egyptian, “2illun” by a Lebanese or “Gullhom” in the Gulf dialect, according to promotional materials for the translator.
The new software uses a linguistic algorithm that looks at the frequency of the structural components of each word. The algorithm is based on a statistical model built from the input of millions of Internet users in Arabic-speaking populations.
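The company has not published how its algorithm works, so the following is only a minimal sketch of the general idea of frequency-based scoring, not the Rosette implementation. It assumes a character-bigram model with add-one smoothing as a stand-in for "structural components," and every name in it (SYMBOL_MAP, CORPUS, transliterate) is hypothetical. The input "2alb," a common Arabizi spelling of the word for "heart," is our own illustration rather than an example from the vendor. The numeral "2" is ambiguous: it can stand for hamza or for qaf in dialects that pronounce qaf as a glottal stop.

```python
import math
from collections import Counter
from itertools import product

# Illustrative symbol table (an assumption, not the product's data).
# "2" maps to two candidate letters: hamza or qaf.
SYMBOL_MAP = {
    "2": ["ء", "ق"],
    "a": [""],  # short vowels are usually left unwritten in Arabic script
    "l": ["ل"],
    "b": ["ب"],
}

# Toy stand-in for a large corpus of Arabic-script words.
CORPUS = ["قلب", "قال", "قلم"]


def bigrams(word):
    """Return adjacent character pairs, with ^ and $ marking word edges."""
    padded = "^" + word + "$"
    return list(zip(padded, padded[1:]))


def train(corpus):
    """Count bigrams and how often each character begins a bigram."""
    pair_counts = Counter()
    first_counts = Counter()
    for word in corpus:
        for a, b in bigrams(word):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    vocab = {ch for pair in pair_counts for ch in pair}
    return pair_counts, first_counts, len(vocab)


def log_score(word, model):
    """Sum of add-one-smoothed bigram log-probabilities for a word."""
    pair_counts, first_counts, vocab_size = model
    return sum(
        math.log((pair_counts[(a, b)] + 1) / (first_counts[a] + vocab_size))
        for a, b in bigrams(word)
    )


def candidates(arabizi):
    """Expand each Arabizi symbol into every Arabic-script option."""
    options = [SYMBOL_MAP[ch] for ch in arabizi.lower()]
    return {"".join(parts) for parts in product(*options)}


def transliterate(arabizi, model):
    """Pick the candidate spelling the frequency model scores highest."""
    return max(candidates(arabizi), key=lambda w: log_score(w, model))


MODEL = train(CORPUS)
print(transliterate("2alb", MODEL))
```

With this toy corpus the sketch generates two candidate spellings for "2alb" and selects قلب, because words beginning with qaf followed by lam are frequent in the training words while the hamza spelling is never seen. A production system would need a far larger symbol table and corpus, and would have to handle symbols missing from the table, which this sketch does not.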
“We believe this product has important implications for government intelligence gathering and web search tools,” Hoffman said.