Thursday, December 22, 2011

Vocaloid2 - A First Look and Listen

According to the README, Vocaloid2 "has a cool new look with improved user functionality", which makes me very happy. Not the new interface. That doesn't make me happy at all. What makes me happy is knowing that I never experienced the old one. If this is an improvement then the old one must have been the interface equivalent of pepper spray being applied to your eyes with butter knives. I think it was designed for users with VGA monitors. Everything is cramped, tiny, and pointlessly shortened in eye-straining white on black. Remember computers in the 80s? It's like that only it has no excuse.

Upon startup, you can jump right in. The select tool is the one you start with, in case you're ready to select any or all of those zero notes. And yes, it uses the classic tool paradigm, with icons. Select arrow; phallus draw tool; mysterious slash; eraser; tic-tac-toe; and... something. One thing at a time! I think I'll spend a few minutes selecting notes. Then I'll draw some for a while. Perhaps later, after the sun has gone down, I'll delete some. Does nobody else use a mouse? You have a perfectly good multi-function tool right under your hand! I can understand some functions requiring a special tool - like whatever the hell that last one does - but select, draw and delete should all be doable using just the mouse. Unless you're Mozart, you will be doing all three of those things at the same time.

Okay, so I switched to the draw tool (because at this point select is completely useless) and I drew in some notes. They all default to "Ooh [u:]". Let's have a listen.

Wait for it...

You have to wait for it because another default is four beats of nothing before the something. Why? I have no idea. Just in case you need to spend some time not doing anything in any way useful, I guess. I like to think that they were inspired by those websites that have a Flash intro; the ones that make you wait a while for a link to go to something worthwhile. Of course, it doesn't show you that there is a pre-roll until you hit play the first time. Why let people get a hint that you're about to disappoint them?

That makes me wonder... Does this thing turn on webcams in the hopes of recording the confused faces of its users? Is it all some practical joke? I'd better moon the webcam, just in case.

Okay, so how do three notes of "Ooh [u:]" sound? Well, less like three notes of "Ooh [u:]" than I would expect*. The middle one sounds suspiciously like a hard e. That's kind of strange. Maybe I'll check the documentation to see if that's a planned thing, like "if you put three of the same sound in a row, the middle one will change to something else for a lark and the third one will sound like a mix of the two. You're welcome!" Ha ha! Just kidding. I'm not going to check the documentation.

Vocaloid2 - 1:Ooh Ooh Ooh by pough

Moving on, I'm going to try to write in some words. How about: "hot dog buns and hot dog wieners"? That's some yummy lyrics! I split up "wieners" into two notes and tried to spell it phonetically. The first thing I notice is that it looks like it defaults anything it can't "read" to a "[u:]" sound. That's fair, but what are my options? How can I fix things so it says "wee nurs" and not "wee ooh"? I would have hoped that a program relying on phonemes would have easy access to some kind of phoneme list or some other tool. It's rather important. Well, unless you only want crazy-person variations of "ooh".

Vocaloid2 - 2:Hot dog buns and hot dog wee ooh by pough

So, let's have a look. Nothing in the right-click context menu... Oh, under the "Lyrics" menu I can see "Phoneme Transformation (T)" - looks like there's even a hotkey for it! Let's see what it does... Hmm. Nothing. Or, if it did something, it hid it well.

That's about all I can handle for now. It's late and I don't want to wrestle with a bad UI any more. Some might say its quirkiness is no more off-putting than the sounds it creates. I disagree. The sounds, while still odd and far from perfect, are pushing the limits. The interface combines a throwback to the dark ages of computers with a methodology that makes me wonder if the developers have ever used a computer before. For example, you can apparently make alterations to the parameters of the voice, but to do that you have to completely shut down the software that makes the sounds and start up a completely different (silent) program. That means that you can't change the sound and hear the sound at the same time, nor can you change the sound when you are able to hear it.

This has to be a joke.

* Each note displays the word or sound that you hope it will make and, in square brackets, the incomprehensible phonemes that it actually will pronounce.

NOTE: Vocaloid is a product that is sold per voice. That is to say, you buy a singer. I bought Sonika, from Zero-G, because she seemed to be the closest I could get to a Japanese pop singer voice in the English world (they have different singers in Japanese editions, likely because of differences in tastes as well as differences in phonemes).
