Re: Fwd: At MIT, they can put words in our mouths

From: Grant Callaghan
Date: Wed May 15 2002 - 16:11:52 BST

    From: "Grant Callaghan" <>
    Subject: Re: Fwd: At MIT, they can put words in our mouths
    Date: Wed, 15 May 2002 08:11:52 -0700

    People who believe anything they see on television are already lost.
    Fooling them is a meaningless exercise.


    >Date: Wed, 15 May 2002 10:41:05 -0400
    >At MIT, they can put words in our mouths
    >By Gareth Cook, Globe Staff, 5/15/2002
    >CAMBRIDGE - Scientists at the Massachusetts Institute of Technology have
    >created the first realistic videos of people saying things they never said
    >- a scientific leap that raises unsettling questions about falsifying the
    >moving image.
    >In one demonstration, the researchers taped a woman speaking into a camera,
    >and then reprocessed the footage into a new video that showed her speaking
    >entirely new sentences, and even mouthing words to a song in Japanese, a
    >language she does not speak. The results were enough to fool viewers
    >consistently, the researchers report.
    >The technique's inventors say it could be used in video games and movie
    >special effects, perhaps reanimating Marilyn Monroe or other dead film
    >stars with new lines. It could also improve dubbed movies, a lucrative
    >global industry.
    >But scientists warn the technology will also provide a powerful new tool
    >for fraud and propaganda - and will eventually cast doubt on everything
    >from video surveillance to presidential addresses.
    >''This is really groundbreaking work,'' said Demetri Terzopoulos, a leading
    >specialist in facial animation who is a professor of computer science and
    >mathematics at New York University. But ''we are on a collision course with
    >ethics. If you can make people say things they didn't say, then potentially
    >all hell breaks loose.''
    >The researchers have already begun testing the technology on video of Ted
    >Koppel, anchor of ABC's ''Nightline,'' with the aim of dubbing a show in
    >Spanish, according to Tony F. Ezzat, the graduate student who heads the MIT
    >team. Yet as this and similar technology makes its way out of academic
    >laboratories, even the scientists involved see ways it could be misused: to
    >discredit political dissidents on television, to embarrass people with
    >fabricated video posted on the Web, or to illegally use trusted figures to
    >endorse products.
    >''There is a certain point at which you raise the level of distrust to
    >where it is hard to communicate through the medium,'' said Kathleen Hall
    >Jamieson, dean of the Annenberg School for Communication at the University
    >of Pennsylvania. ''There are people who still believe the moon landing was
    >Currently, the MIT method is limited: It works only on video of a person
    >facing a camera and not moving much, like a newscaster. The technique only
    >generates new video, not new audio.
    >But it should not be difficult to extend the discovery to work on a moving
    >head at any angle, according to Tomaso Poggio, a neuroscientist at the
    >McGovern Institute for Brain Research, who is on the MIT team and runs the
    >lab where the work is being done. And while state-of-the-art audio
    >simulations are not as convincing as the MIT software, that barrier is
    >likely to fall soon, researchers say.
    >''It is only a matter of time before somebody can get enough good video of
    >your face to have it do what they like,'' said Matthew Brand, a research
    >scientist at MERL, a Cambridge-based laboratory for Mitsubishi Electric.
    >For years, animators have used computer technology to put words in people's
    >mouths, as they do with the talking baby in CBS's ''Baby Bob'' - creating
    >effects believable enough for entertainment, but still noticeably
    >computer-generated. The MIT technology is the first that is
    >''video-realistic,'' the researchers say, meaning volunteers in a
    >laboratory test could not distinguish between real and synthesized clips.
    >And while current computer-animation techniques require an artist to smooth
    >out trouble spots by hand, the MIT method is almost entirely automated.
    >Previous work has focused on creating a virtual model of a person's mouth,
    >then using a computer to render digital images of it as it moves. But the
    >new software relies on an ingenious application of artificial intelligence
    >to teach a machine what a person looks like when talking.
    >Starting with between two and four minutes of video - the minimum needed
    >for the effect to work - the computer captures images which represent the
    >full range of motion of the mouth and surrounding areas, Ezzat said.
    >The computer is able to express any face as a combination of these faces
    >(46 in one example), the same way that any color can be represented by a
    >combination of red, green, and blue. The computer then goes through the
    >video, learning how a person expresses every sound, and how it moves from
    >one to the next.
    >Given a new sound, the computer can then generate an accurate picture of
    >the mouth area and virtually superimpose it on the person's face, according
    >to a paper describing the work. The researchers are scheduled to present
    >the paper in July at Siggraph, the world's top computer graphics conference.
    >The effect is significantly more convincing than a previous effort, called
    >Video Rewrite, which recorded a huge number of small snippets of video and
    >then recombined them. Still, the new method only seems lifelike for a
    >sentence or two at a time, because over longer stretches, the speaker seems
    >to lack emotion.
    >MIT's Ezzat said that he would like to develop a more complex model that
    >would teach the computer to simulate basic emotions.
    >A specialist can still detect the video forgeries, but as the technology
    >improves, scientists predict that video authentication will become a
    >growing field - in the courts and elsewhere - just like the authentication
    >of photographs. As video, too, becomes malleable, a society increasingly
    >reliant on live satellite feeds and fiber optics will have to find even
    >more direct ways to communicate.
    >''We will probably have to revert to a method common in the Middle Ages,
    >which is eyewitness testimony,'' said the University of Pennsylvania's
    >Jamieson. ''And there is probably something healthy in that.''
    >Compare original and synthetic videos from MIT on
    >Gareth Cook can be reached at
    >This story ran on page A1 of the Boston Globe on 5/15/2002. Copyright
    >2002 Globe Newspaper Company.
    >==============================================================
    >This was distributed via the memetics list associated with the
    >Journal of Memetics - Evolutionary Models of Information Transmission
    >For information about the journal and the list (e.g. unsubscribing)


    The means you use shape the ends you get.

    This archive was generated by hypermail 2b29 : Wed May 15 2002 - 16:30:02 BST