Tatoeba Tips

Tatoeba began as the brainchild of Trang, inspired by the English–Japanese website alc.co.jp. The name “Tatoeba” even comes from the Japanese word for “for example.” You can read more about the history of Tatoeba.org on Trang’s blog, but the long and short of it is: Tatoeba is a collection of open source, community-generated sentences in multiple languages, something like a huge, global phrasebook. These sentences can be a great resource in your language study. But Tatoeba can also be overwhelming at first, so here are some tips to get you started.

1. You should probably register.

Anyone can browse Tatoeba and look up sentences without an account; registering allows you to contribute translations, add your own sentences, and (eventually, if you decide to ask for such privileges) tag and link sentences. If you’re just curious about a word now and then, you probably don’t need to sign up. But if you want to dig deeper, you’ll need a proper account. (It’s free!)

Note that Tatoeba, unlike Lang-8, doesn’t make a clear distinction between your native language and the languages you’re studying; rather, you list any languages you can speak, and then rate your fluency in them, from “almost no knowledge” to “native level.” So go ahead and add everything you’re interested in and know about. Here are mine, for example:


There is no limit to how many languages you can have in your account, and there’s no fluency requirement, so add as many as you like. My Korean, for example, is in absolute shambles, but since I at least know how to read Hangul, I listed it (and then put it at level 0: “almost no knowledge”).

2. Learn to use the search function.

Tatoeba uses Sphinx Search to account for all of the complexities of language. It’s mostly intuitive, but there are some wrinkles to be aware of. You can learn more at the Tatoeba Wiki.

Sphinx Search powers the search bar at the top of the page. This search looks only at the content of sentences, matching actual, literal words. If you’re interested in a particular category of words, such as sports or politics or weather, you can search the tags instead. The tag search is much simpler and does not use the same operators as Sphinx Search.

3. Add sentences.

If you want to improve Tatoeba (and of course you do, right?) and you have the time, you can also add sentences of your own. There are two ways to do that.

First, you can simply add a sentence directly to the corpus. Tatoeba even helpfully suggests vocabulary that hasn’t yet been featured on the site, so you can maximize your helpfulness by focusing specifically on those words.

The other way you can add sentences is by translating sentences already in the corpus.

When you’re looking at sentences on Tatoeba, you’ll see a little symbol in the upper left corner of every sentence, like this:


This is the option to translate. It’s not necessary (and even, arguably, flat-out unhelpful) to give a translation that’s identical to what’s already on the site. (Alas, there’s also no upvoting/approval system like there is on Lang-8, so there’s no good way to tell if a given translation is good or bad.) But if you look at a sentence and see that it doesn’t have a translation in a language you know well, or the other translations are awkward or inadequate, then you can feel free to add one! When you click that symbol, a little box comes up:

I’m not actually brave enough to try to translate this into English. Yet.

Tatoeba also uses indirect (from L1 to L3 by way of L2) translations. It distinguishes between direct and indirect translations with blue arrows (indicating direct translations) and gray arrows (indirect translations). But you have to be careful: if you decide to translate something indirectly, make sure you click the translation you’re working from first. This will take you to a new page where that L2 translation is the “main sentence,” rather than just a translation. That way, your L3 translation is appropriately marked on the original L1 sentence as an indirect translation, and the code stays neat and tidy. (You can read more about Trang’s philosophy here.)
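If the direct/indirect distinction feels abstract, it can help to picture the corpus as a little graph: sentences are dots, and each translation is a line between two dots. Here is a minimal Python sketch of that idea. To be clear, this is my own toy model, not Tatoeba’s actual code: the `Corpus` class, its method names, and the sentence IDs are all made up for illustration, and I’m assuming that links work in both directions.

```python
from collections import defaultdict


class Corpus:
    """Toy model of linked sentences (hypothetical, not Tatoeba's real schema)."""

    def __init__(self):
        self.sentences = {}            # sentence id -> (language, text)
        self.links = defaultdict(set)  # sentence id -> ids it is linked to

    def add_sentence(self, sid, lang, text):
        self.sentences[sid] = (lang, text)

    def link(self, a, b):
        # Assumption: a translation link is symmetric.
        self.links[a].add(b)
        self.links[b].add(a)

    def direct(self, sid):
        # Direct translations: one link away (the "blue arrows").
        return set(self.links[sid])

    def indirect(self, sid):
        # Indirect translations: two links away, and not already direct
        # (the "gray arrows").
        direct = self.direct(sid)
        two_hops = set()
        for neighbor in direct:
            two_hops |= self.links[neighbor]
        return two_hops - direct - {sid}


corpus = Corpus()
corpus.add_sentence(1, "eng", "I'm 25 years old.")
corpus.add_sentence(2, "fra", "J'ai 25 ans.")
corpus.add_sentence(3, "rus", "Мне 25 лет.")

corpus.link(1, 2)  # French added as a translation of the English sentence
corpus.link(2, 3)  # Russian added from the French sentence's own page

print(corpus.direct(1))    # {2}
print(corpus.indirect(1))  # {3}
```

Notice that the Russian sentence is attached to the French one, not to the English one. That is the whole point of clicking through to the L2 sentence first: had I called `corpus.link(1, 3)` instead, the Russian would show up as a direct translation of the English, even though I never worked from the English at all.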

The guiding principle of translating on Tatoeba is to translate whole sentences and their meaning, rather than aim for word-for-word correspondence. “I’m 25 years old” is not, technically, a word-for-word translation of the French (“I have 25 years.”) or the Russian (“To me there are 25 years.”), but it’s how native English speakers would express the idea of being 25 years old, so it’s the best (and only) possible choice.

4. Submit high-quality work.

Tatoeba is not a playground, or an opportunity for feedback/error correction. When you submit a translation or a sentence, you are submitting study material for other learners to use. This is why Tatoeba stresses that you add translations and sentences only in languages in which you have a fairly high level of competency. Anything else (grammar or vocabulary practice, writing practice, proofreading) is better saved for elsewhere, such as Lang-8.

That’s Tatoeba in a nutshell! I’d like to give a shout-out to my friend Yousef, who was the first to alert me to the existence of Tatoeba. It’s a great project but a little overwhelming, so if you need help (or if I missed anything), comment below or let me know on Twitter!

