Duolingo: Learn a Language While Translating the Web
Currently, there are over 2 billion internet users. Only 26.8% of these users
are English speakers, yet 55.9% of web content is in English. This alienates
the 1,657,347,866 people who do not speak English and therefore cannot use
these sites. Machine translation is not yet good enough to translate the sites
automatically. If you look at translations produced by a computer, you see that
most of the time they are incorrect. And how would you identify the few times
that the machine translation is correct? Consequently, the best solution right
now is to have people complete these translations.
Duolingo is a recent project started at CMU by Professor Luis von
Ahn and his team that uses crowdsourcing for text translation. Their goal is
to use people to translate the web. Von Ahn saw two problems with translating
the web: there are not enough bilinguals to do all this work, and people need
motivation to do it for free. The team thought of a solution to both problems:
translate the web through education. Duolingo is a way for people to learn a
new language. Users are given sentences matched to their level of proficiency
in a language, which they then translate. I will discuss how Duolingo works in
more detail later.
First, I want to talk about why von Ahn and his team think that
this project has incredible potential to be successful. Translating all of
Wikipedia into Spanish would cost about $50 million using professional
translators. For translating the web to work, it has to be done by people
volunteering their time, with no other costs. Von Ahn’s previous project was
reCAPTCHA. It used the same idea of crowdsourcing at no extra cost: people
digitize books as they enter the CAPTCHAs that certain websites require.
[Figure 1: example CAPTCHA image]
(If you do not know what a CAPTCHA is, it is shown in Figure 1: the distorted
text image that users have to type out to prove that they are not a computer.
CAPTCHAs are often used when you register for a website, or when you buy
tickets from Ticketmaster, to prevent scalpers from taking advantage of
the system.) The reCAPTCHA technology takes work that people already have to do
and adds value to it. Many books are not in good enough condition for computers
to identify the text, whereas humans can still read the words and type them
out. By combining many responses, reCAPTCHA can create fairly accurate digital
versions of the text.
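To make "combining many responses" concrete, here is a minimal sketch of a majority vote over several people's readings of the same scanned word. It is only an illustration of the general idea, not the actual reCAPTCHA algorithm: the function name `combine_transcriptions` and the 60% agreement threshold are assumptions made for this example.

```python
from collections import Counter


def combine_transcriptions(responses, min_agreement=0.6):
    """Return the most common reading if enough responses agree, else None.

    Hypothetical helper: the 0.6 threshold is an assumption for illustration.
    """
    # Normalize so that case and stray whitespace do not split the vote.
    normalized = [r.strip().lower() for r in responses if r.strip()]
    if not normalized:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    # Accept the winner only when its share of the votes reaches the threshold.
    return word if votes / len(normalized) >= min_agreement else None


# Five people type what they see for the same scanned word.
readings = ["morning", "morning", "Morning ", "mourning", "morning"]
print(combine_transcriptions(readings))  # morning
```

With only two readings that disagree, the function returns None, which is where a real system would presumably ask more people.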
The reCAPTCHA project has had billions of users and has been
extremely successful. Von Ahn hopes that Duolingo will repeat this success
and once again reach billions of users. In terms of numbers, approximately 1.2
billion people want to learn a foreign language each year. Language
software can cost up to $500, which shuts out people who cannot
afford it. Duolingo is a free alternative that can attract all
of these people. It is similar to reCAPTCHA in that it takes
something people already do, in this case trying to learn a new
language, and adds value to it by applying their translations to real
content and translating the web.
Now, think back to the task of translating Wikipedia into
Spanish. With 100,000 users, Duolingo would be able to complete this task in
just 5 weeks. With 1 million users, it could be done in just 80 hours. The
question then becomes: how accurate would these translations be? If beginners
are translating websites into a language that they have never seen before, their
sentences can be entirely incorrect. In testing, however, the translations that
Duolingo produces are as good as those of professionals, without sacrificing
speed. This is because it combines many users’ translations and provides many
learning tools.
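The two time estimates above can be checked against each other with a little arithmetic. The sketch below assumes that the time needed is inversely proportional to the number of users and that "5 weeks" means calendar time; both are assumptions made for this example, not something the Duolingo team has stated.

```python
HOURS_PER_WEEK = 7 * 24


def scaled_hours(base_users, base_weeks, new_users):
    """Estimate hours needed with new_users, assuming time scales as 1 / users."""
    base_hours = base_weeks * HOURS_PER_WEEK
    return base_hours * base_users / new_users


# 100,000 users need 5 weeks; how long would 1 million users need?
print(scaled_hours(100_000, 5, 1_000_000))  # 84.0
```

Under these assumptions, ten times as many users would need about 84 hours, which is in line with the quoted figure of 80 hours.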
Next, I want to look more specifically at how Duolingo works.
Pretend that you are a Duolingo user, and I will walk you through the
website. First, you are given a sentence that fits your language
level. Then you have the option to see the context that the sentence was
taken from. Because all users are translating real content, the exercise is
more interesting and encourages you to keep learning. It is a practical
application of language skills.
[Figure 2: hovering over a word to see other users’ translations]
When looking at a specific sentence, you can hover over
a word to see other users’ translations for help, as illustrated
in Figure 2. Once you submit a translation, you are given
feedback. Duolingo will either congratulate you for being correct or indicate
what is wrong with your translation. For example, if there was a simple typo
that it could detect (e.g., typing “epro” instead of “pero”), Duolingo will say
that your translation was correct but inform you of the typo. It will also
indicate whether there were more significant errors (e.g., using a masculine
adjective instead of a feminine one). Duolingo will also help you understand
and memorize the words you did not know and hovered over by giving educational
examples. You can then vote on the quality of other translations displayed,
giving the site feedback on other users’ efforts.
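One way a program could tell a slip like “epro” from a real mistake is to measure how many small edits separate the typed word from the expected one. The sketch below uses the optimal string alignment distance, which counts a swap of two neighboring letters as a single edit. It is not Duolingo's actual method: the function names, the one-edit threshold and the tiny vocabulary are all assumptions made for this example.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between a and b.

    Allowed edits: insert, delete, substitute, or swap two adjacent letters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a letter
                d[i][j - 1] + 1,         # insert a letter
                d[i - 1][j - 1] + cost,  # substitute (or keep) a letter
            )
            # Two neighboring letters typed in the wrong order.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def classify_word(typed, expected, vocabulary=frozenset(), max_typo_distance=1):
    """Label a typed word as 'correct', 'typo', or 'wrong' (hypothetical rule)."""
    typed, expected = typed.lower(), expected.lower()
    if typed == expected:
        return "correct"
    # A different real word (e.g. the wrong gender) is an error, not a typo.
    if typed in vocabulary:
        return "wrong"
    if osa_distance(typed, expected) <= max_typo_distance:
        return "typo"
    return "wrong"


spanish_words = {"pero", "rojo", "roja"}  # tiny vocabulary, for illustration only
print(classify_word("epro", "pero", spanish_words))  # typo
print(classify_word("rojo", "roja", spanish_words))  # wrong
```

The vocabulary check matters because a rule based on distance alone would treat “rojo” for “roja” as a harmless typo, even though it is exactly the masculine/feminine kind of error mentioned above.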
This feedback feature is also one of the
motivating factors for users. Most language learning methods outside of a class
cannot give you feedback on your language skills. A workbook cannot tell you
that your answer is incorrect when there are so many possible translations of
the same sentence. Other software usually provides feedback only for
multiple-choice questions or single words. According to von Ahn, feedback is
crucial to learning, and Duolingo attempts to provide feedback on entire
sentences. Of course, as mentioned, there are multiple translations for the
same sentence, which makes this feedback a very difficult task.
I have discussed some of the background and intentions of
Duolingo, but not much about its results in practice. This is because
Duolingo was only privately launched in November 2011. By January, 45,000
sentences had been translated during this private launch, giving a lot of hope
for Duolingo’s future success. The Spanish and German language options
were released in a beta version in March. There is an extremely long waiting
list of people who want accounts for the beta version.
Some people who have already obtained beta accounts have blogged about
them. From what I have read, people have found the user interface and layout
of the lessons to be “flawless and visually appealing” (Classical Bookworm).
As for the educational functions, the lessons and translations earn you points
that allow you to progress to further levels.
[Figure 3: layout of the lessons]
The layout of the lessons is illustrated in Figure 3. If you
make too many mistakes, you are forced to redo the lesson. Duolingo has audio
features, images for people who learn better visually, and even social
features where you can ask questions for others to answer. Overall, the
feedback for Duolingo has been very positive.
Duolingo has great potential to impact the web and already has
a lot of user interest; I cannot wait to see what the future has in store.
To learn more, visit the official site: http://duolingo.com/
Works Cited
"Classical Bookworm: Duolingo: First Impressions." Classical Bookworm. Web. <http://classical-bookworm.blogspot.com/2012/01/duolingo-first-impressions.html>.
"Duolingo | Learn English, Spanish and German for Free." Duolingo. Web. <http://duolingo.com/>.
Luis Von Ahn -- Duolingo: The Next Chapter in Human Computation. TEDxCMU, 25 Apr. 2011. Web. <http://www.youtube.com/watch?v=cQl6jUjFjp4>.
"Top Ten Internet Languages - World Internet Statistics." Internet World Stats. Web. <http://www.internetworldstats.com/stats7.htm>.
"Usage of Content Languages for Websites." Usage Statistics of Content Languages for Websites, April 2012. Web. <http://w3techs.com/technologies/overview/content_language/all>.
"World Internet Usage Statistics News and World Population Stats." Internet World Stats. Web. <http://www.internetworldstats.com/stats.htm>.