This post is outdated.
We've made big improvements since this was published.
Check out what's new:

How we made state-of-the-art speech synthesis scalable with Modular