Sanzang


Sanzang is a compact, simple, cross-platform machine translation system. It is especially useful for translating from the CJK languages (Chinese, Japanese, and Korean), and it is well suited to ancient and otherwise difficult texts. Unlike most other machine translation systems, Sanzang is small and approachable. Users can develop their own translation rules, which are stored in a plain text file and applied at runtime.

Last announcement

Sanzang on the Web 08 Nov 2013

You can now use a Web interface to Sanzang to try its translation method. This is called "Sanzang on the Web," and it is configured to use an early...

Recent releases

  •  04 Mar 2014 12:01

Release Notes: The vocabulary building code was updated for more efficient term matching. The TextFormatter class was refactored into a Formatting module. Methods were added for merging translation tables into one another. The Sanzang module's "requires" were consolidated into a central location.

  •  13 Feb 2014 16:54

Release Notes: This release cleans up the translation table initialization code, making it faster and simpler. It adds an RDoc option that sets the documentation encoding to UTF-8 for RDoc 3.x, so the documentation builds properly (including when installed as a gem). The example and test translation tables were adjusted so that they no longer use leading spaces or other deprecated table formatting.

  •  29 Jan 2014 14:21

Release Notes: Horizontal space formatting has been updated so that spaces are never added to the end of a line, and the horizontal spacing code is now more robust. A transcoding bug was also fixed in Sanzang::Translator#translate_io. The bug was triggered when using Sanzang internals as a library, calling the method with file paths as the arguments, and using an encoding other than UTF-8.

  •  27 Jan 2014 23:39

Release Notes: This is a minor release containing a new feature but maintaining backward compatibility. The Sanzang translation method has been updated to automatically handle horizontal spacing between translated terms. This means that translation tables no longer need to have extra spacing as part of their format.

  •  29 Dec 2013 13:50

Release Notes: This is a bugfix release, primarily to resolve issues with internal transcoding between UTF-8 and other encodings. Additionally, since JRuby encoding support is limited, Sanzang on JRuby now uses UTF-8 by default.
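
Two of the release notes above describe horizontal spacing: spaces between translated terms are handled automatically, and no spaces are added to the end of a line. The following Ruby sketch shows one way such behavior could be achieved as a post-processing step. It is an assumption for illustration only, not the code in Sanzang's Formatting module, and normalize_spacing is a hypothetical name.

```ruby
# Collapse runs of spaces and tabs to a single space and trim both ends of
# each line, while preserving the line breaks themselves.
# Illustrative sketch only; not taken from Sanzang's source.
def normalize_spacing(text)
  text.each_line.map { |line|
    newline = line.end_with?("\n") ? "\n" : ""
    line.chomp.gsub(/[ \t]+/, " ").strip + newline
  }.join
end

puts normalize_spacing(" sanzang  fashi \n zhuan ")
# prints:
# sanzang fashi
# zhuan
```

With a step like this applied after term replacement, table entries would not need to carry their own padding spaces, which matches the motivation given in the release notes.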
