To: Ish who wrote (210191)
From: Thomas A Watson
12/14/2001 7:05:54 PM

An essay in engineering.

Technology Not Good Enough to Fake Bin Laden Tape, Say Experts
By Jim Krane
The Associated Press
Published: Dec 14, 2001

NEW YORK (AP) - Even with the current state of digital wizardry, faking the videotape in which Osama bin Laden appears to take credit for the Sept. 11 attacks would be extremely difficult, experts said Friday.

The biggest hurdle would be mimicking the cadence and rhythm of natural human speech. Synchronizing a doctored soundtrack with existing video would also be tough. And technology that can synthesize Arabic speech is still in its infancy.

Chi-Lin Shih, a language modeling scientist at Lucent Technologies' Bell Labs, described the process as akin to reassembling a broken vase by gluing together its shards. Close scrutiny would likely reveal the cracks.

Software tools allow for elements of a person's speech to be glued together to put words in their mouths - but such a doctored recording would not sound natural to an expert listener, said Kenneth Stevens, head of the speech research lab at the Massachusetts Institute of Technology.

Some hardline Islamic militants in Pakistan and the Middle East suggest the tape was fabricated to provide a rationale for U.S. military actions in Afghanistan. President Bush on Friday called the charge "preposterous."

Administration officials said they intentionally declined to try to enhance the video's sound or picture so as not to give detractors ammunition.

Emerging speech synthesis technology is giving computers the ability to mimic a human voice. The creators of AT&T's Natural Voices software, for example, claim the program can mimic the speech of actors now dead, such as John Wayne.

By allowing computers to analyze enough tapes of an actor's voice, the program could synthesize the voice, allowing it to make statements that Wayne never said.

Theoretically, the same could be done with bin Laden's voice, since recordings of his speech are readily available, said Lynn Shepherd, a vice president of Fonix Corp., a speech synthesis software company in Salt Lake City.

"If they had a lot of recordings of bin Laden, they could create some speech that sounded pretty good," Shepherd said.

Most such engines require a dozen or more hours of high-quality studio recordings, where a speaker is asked to make all of a language's particular combinations of sounds.

"It takes engineers months to break down all these voice fragments so that I can reproduce the language," said Bill DeStefanis, who heads speech technology for ScanSoft, Inc. of Peabody, Mass.

"The idea that the U.S. government could have done this in the space of a month is highly improbable," DeStefanis said. "With a short snippet, I might be able to fake you out, but not a long speech."

On the tape, some of bin Laden's words are unintelligible. The tape's poor sound quality could theoretically be used to mask tampering, experts said.

But beyond synthesizing a voice, doctored speech would have to be synchronized with video - another difficult task, and one where mistakes are usually easy to spot.

Digital synchronization of sound and images is a staple of Hollywood filmmaking. In the 2000 movie "Gladiator," actor Oliver Reed died before shooting ended and the filmmakers pieced together several scenes using previously shot footage.

DeStefanis and others said, however, that fooling the trained eye is difficult.

"The human eye and ear are very good at seeing out-of-synch lips," he said.

Also, few if any of today's sophisticated speech synthesis engines have been programmed to generate bin Laden's native Arabic, said Shepherd.

"There are some low-quality ones, but nothing that would be good enough," Shepherd said.

source: AP/breaking/

tom watson tosiwmee