Reliable customizable text translation supporting 200+ languages through complex tokenization and sentence splitting.

Text Translation

Sieve's Translate app is designed for developers seeking an easy-to-use interface for text translation. It layers sentence and word tokenization on top of a translation model to translate reliably across 200+ languages, even for long documents. Under the hood, it uses OpenAI's GPT-4o and GPT-4o-mini and the Seamless Communication model as backends.

Key Features

  • Wide Language Support: Translate supports 200+ languages with the gpt-4o and gpt-4o-mini backends. The seamless backend supports 99 languages.
  • Language Styles: You can specify language styles such as "informal french", "shakespearean english", or "brazilian portuguese" (only available with the gpt-4o and gpt-4o-mini backends).
  • Safe words: Specify safe words that you don't want to translate such as product names or locations.
  • Translation Dictionary: Customize translations by specifying mappings for specific words or phrases to control the output translations.
  • Batch Processing: You can pass in a serialized list to the text param using json.dumps(list) to process multiple sentences at once.
  • Track Costs: Use return_bill to keep track of each job's processing time, token usage and costs.
  • Unlimited Context Window: Translate any length of text without worrying about any limits!
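
As a minimal sketch of the batch-processing feature above, the snippet below serializes a list of sentences into a single string with json.dumps, which is the form the text param is described as accepting. Only the standard library is used; the variable names are illustrative, and the call that submits the job to Sieve is intentionally left out.

```python
import json

# Sentences to translate in one job (illustrative sample data).
sentences = [
    "Hello, how are you?",
    "The meeting starts at noon.",
    "Please review the attached document.",
]

# Serialize the list into one JSON string to pass as the `text` param.
text_param = json.dumps(sentences)

# Sanity check: the string decodes back to the original list.
assert json.loads(text_param) == sentences

print(text_param)
```

Passing one serialized list keeps related sentences in a single job instead of one job per sentence.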

Pricing

By default, you are billed at cost for the selected backend, plus a small processing fee.

Using your own API keys

Optionally, if you have your own OpenAI API keys, you can enter them in our Secrets section to be billed at your own rate for those backends.

Backend | $ / 1M input tokens | $ / 1M output tokens | $ / hr | Secret name
GPT-4o | $5.00 | $15.00 | $0.40 | OPENAI_API_KEY
GPT-4o-mini | $0.15 | $0.60 | $0.40 | OPENAI_API_KEY
Seamless | - | - | $1.24 | -

Note: GPT-4o and GPT-4o-mini are billed on the number of tokens used plus the processing cost. A token is roughly equal to 3/4 of a word. You can calculate the number of tokens in your text using Sieve's tiktoken app. The processing cost is based on the time the app takes to generate the translation. The seamless backend is billed only on processing time, as it is a model hosted on Sieve.
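
For a quick estimate before running a job, the 3/4-of-a-word rule of thumb above can be turned into a small helper. This is a rough sketch only: estimate_tokens is a hypothetical name, not part of Sieve's API, and the real count depends on the tokenizer and the language of the text, so a tokenizer such as the tiktoken app gives the billable figure.

```python
def estimate_tokens(word_count: int) -> int:
    """Rough token estimate: one token is about 3/4 of a word,
    so tokens are approximately words * 4 / 3 (rounded up)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Integer ceiling of word_count * 4 / 3, avoiding float rounding.
    return (word_count * 4 + 2) // 3


sample = "Translate any length of text without worrying about limits"
words = len(sample.split())
print(words, "words ->", estimate_tokens(words), "tokens (approx.)")
```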

Example: Let's Translate an Entire Book

As an example, let's translate Mary Shelley's Frankenstein to Spanish using Sieve's translate API. The book contains 77,883 words, which map to 94,420 input tokens. The processing time is around 2 minutes and there are 124,981 output tokens. Assuming we are using the GPT-4o-mini backend, here's how much it will cost us:

input_tokens = 94,420
output_tokens = 124,981
token_cost = (94,420 tokens * $0.15 + 124,981 tokens * $0.60) / 1M tokens = $0.0891516
processing_cost = 2 minutes * $0.40 / 60 minutes ≈ $0.013
total_cost ≈ $0.089 + $0.013 = $0.102

We have just translated an entire book for roughly $0.10!
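
The same arithmetic can be sketched in a few lines of Python. The rates are copied from the pricing table earlier in this README; RATES and estimate_cost are hypothetical names for illustration, not part of Sieve's API, and the sketch does not model any fee that is not listed in that table. With the processing cost left unrounded, the total for this example comes to about $0.10.

```python
# Rates in USD, copied from the pricing table in this README.
RATES = {
    "gpt-4o": {"input_per_1m": 5.00, "output_per_1m": 15.00, "per_hour": 0.40},
    "gpt-4o-mini": {"input_per_1m": 0.15, "output_per_1m": 0.60, "per_hour": 0.40},
    # Seamless is billed on processing time only.
    "seamless": {"input_per_1m": 0.0, "output_per_1m": 0.0, "per_hour": 1.24},
}


def estimate_cost(backend, input_tokens, output_tokens, processing_seconds):
    """Return (token_cost, processing_cost, total) in USD for one job."""
    rate = RATES[backend]
    token_cost = (
        input_tokens * rate["input_per_1m"]
        + output_tokens * rate["output_per_1m"]
    ) / 1_000_000
    processing_cost = processing_seconds / 3600 * rate["per_hour"]
    return token_cost, processing_cost, token_cost + processing_cost


# Frankenstein example: 94,420 input tokens, 124,981 output tokens, ~2 minutes.
token_cost, processing_cost, total = estimate_cost(
    "gpt-4o-mini", 94_420, 124_981, 120
)
print(f"token cost:      ${token_cost:.4f}")
print(f"processing cost: ${processing_cost:.4f}")
print(f"total:           ${total:.4f}")
```

Swapping the backend name lets you compare what the same job would cost elsewhere, for example with gpt-4o at its higher token rates.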

You can check out our Spanish translation of Frankenstein here - it took only 1 minute, 53 seconds!

Seamless Languages

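Seamless supports 99 languages in total. Here are the language codes you can use; the last entries in the list are alternate names for codes that appear earlier (for example, my is listed as both Myanmar and Burmese):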

  • en (English)
  • zh (Chinese)
  • de (German)
  • es (Spanish)
  • ru (Russian)
  • ko (Korean)
  • fr (French)
  • ja (Japanese)
  • pt (Portuguese)
  • tr (Turkish)
  • pl (Polish)
  • ca (Catalan)
  • nl (Dutch)
  • ar (Arabic)
  • sv (Swedish)
  • it (Italian)
  • id (Indonesian)
  • hi (Hindi)
  • fi (Finnish)
  • vi (Vietnamese)
  • he (Hebrew)
  • uk (Ukrainian)
  • el (Greek)
  • ms (Malay)
  • cs (Czech)
  • ro (Romanian)
  • da (Danish)
  • hu (Hungarian)
  • ta (Tamil)
  • no (Norwegian)
  • th (Thai)
  • ur (Urdu)
  • hr (Croatian)
  • bg (Bulgarian)
  • lt (Lithuanian)
  • la (Latin)
  • mi (Maori)
  • ml (Malayalam)
  • cy (Welsh)
  • sk (Slovak)
  • te (Telugu)
  • fa (Persian)
  • lv (Latvian)
  • bn (Bengali)
  • sr (Serbian)
  • az (Azerbaijani)
  • sl (Slovenian)
  • kn (Kannada)
  • et (Estonian)
  • mk (Macedonian)
  • br (Breton)
  • eu (Basque)
  • is (Icelandic)
  • hy (Armenian)
  • ne (Nepali)
  • mn (Mongolian)
  • bs (Bosnian)
  • kk (Kazakh)
  • sq (Albanian)
  • sw (Swahili)
  • gl (Galician)
  • mr (Marathi)
  • pa (Punjabi)
  • si (Sinhala)
  • km (Khmer)
  • sn (Shona)
  • yo (Yoruba)
  • so (Somali)
  • af (Afrikaans)
  • oc (Occitan)
  • ka (Georgian)
  • be (Belarusian)
  • tg (Tajik)
  • sd (Sindhi)
  • gu (Gujarati)
  • am (Amharic)
  • yi (Yiddish)
  • lo (Lao)
  • uz (Uzbek)
  • fo (Faroese)
  • ps (Pashto)
  • tk (Turkmen)
  • nn (Nynorsk)
  • mt (Maltese)
  • sa (Sanskrit)
  • lb (Luxembourgish)
  • my (Myanmar)
  • bo (Tibetan)
  • tl (Tagalog)
  • mg (Malagasy)
  • as (Assamese)
  • tt (Tatar)
  • haw (Hawaiian)
  • ln (Lingala)
  • ha (Hausa)
  • ba (Bashkir)
  • jw (Javanese)
  • su (Sundanese)
  • yue (Cantonese)
  • my (Burmese)
  • ca (Valencian)
  • nl (Flemish)
  • ht (Haitian)
  • lb (Letzeburgesch)
  • ps (Pushto)
  • pa (Panjabi)
  • ro (Moldavian)
  • si (Sinhalese)
  • es (Castilian)
  • zh (Mandarin)