This is pretty cool. I downloaded it and have been having fun with the "analogy" generation, using both the small (and noisy) corpus and the bigger Freebase model. It's pretty entertaining.
Gorilla:King Kong::Lizard:Godzilla (nice!)
Gorilla:King Kong::Dog:Lassie (ok, I guess in the "most famous" sense...)
Gorilla:King Kong::Cat:Kentucky Wildcats Mens Basketball (huh?)
It gets some other impressive ones like "image is to mirror as text is to exegesis".
But some of these are my favorites:
writing is to poetry as programming is to haskell
but
programming is to haskell as writing is to klingon
Other good ones:
writing is to novel as programming is to java
writing is to nonsense as programming is to lua
programming is to c as writing is to letters
It also does interesting things with subtraction:
china - communism =~ hong kong
marriage - love =~ divorce
sky - light =~ constellation
game - goal =~ role playing
usa - guns =~ canada
government - inheritance =~ democratically elected
comedy - laugh =~ television
computers - software =~ machines
fire - hot =~ weapons
ice - cold =~ rocks
river - water =~ valley
winter - cold =~ summer
culture - context =~ miscellaneous topics
software - evil =~ GNU GPL
GNU GPL - evil =~ LGPL
California - high tech =~ Florida
Florida - Disney =~ Missouri
wealth - responsibility =~ Capitalist
world - oil =~ war
god - omnipotent =~ man
man - wisdom =~ young
sleep - rest =~ insomnia
wizard - magic =~ mad scientist
death - birth =~ work
It also gets some nonsense (room - ceiling = prince, congress - senate = library, cake - sugar = steed, tax cuts - spending = woodchucks), but most of the time it's at least vaguely sensical, even on a small dataset.
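For anyone wondering what the tool is doing under the hood, here is a rough sketch of the idea as I understand it: every word is a vector, "a is to b as c is to ?" becomes b - a + c, and the answer is whichever remaining word points in the closest direction. The 4-d vectors below are made up by hand purely for illustration (real models learn hundreds of dimensions from text), and the function names `nearest`, `analogy`, and `subtract` are my own, not anything from the released code.

```python
import math

# Hypothetical hand-made vectors; dimensions are [ape, reptile, canine, famous].
# A trained model would learn these from a corpus instead.
VECTORS = {
    "gorilla":   [1.0, 0.0, 0.0, 0.0],
    "king kong": [1.0, 0.0, 0.0, 1.0],
    "lizard":    [0.0, 1.0, 0.0, 0.0],
    "godzilla":  [0.0, 1.0, 0.0, 1.0],
    "dog":       [0.0, 0.0, 1.0, 0.0],
    "lassie":    [0.0, 0.0, 1.0, 1.0],
    "famous":    [0.0, 0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def nearest(target, exclude):
    """Return the vocabulary word most similar to target, skipping the query words."""
    candidates = [w for w in VECTORS if w not in exclude]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


def analogy(a, b, c):
    """Solve a:b::c:? by looking up the word nearest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    return nearest(target, {a, b, c})


def subtract(a, b):
    """Return the word nearest to a - b."""
    target = [va - vb for va, vb in zip(VECTORS[a], VECTORS[b])]
    return nearest(target, {a, b})


print(analogy("gorilla", "king kong", "lizard"))
print(analogy("gorilla", "king kong", "dog"))
print(subtract("lassie", "famous"))
```

With a toy vocabulary this clean the answers come out tidy. With a real, noisy corpus the nearest word can be something only loosely related, which is presumably where the Kentucky Wildcats come from.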
“Crown jewel” of natural language processing has been open sourced by Google #nlp
...prepackaged deep-learning software designed to understand the relationships between words with no human guidance. Just input a textual data set and let underlying predictive models get to work learning.
Google calls it “an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words.”
Deep learning, Howard explained, is essentially a bigger, badder take on the neural network models...
via +Ward Plunet cc +Doyle Groves
We're on the cusp of deep learning for the masses. You can thank Google later

deer+deer=elk
song+song=album
few+few=several
cold+cold=winter
roof+roof=underneath
air+air=helicopters
Some other interesting ones:
slavery+war=civil war
song+words=lyrics
caffeine+cola=carbonated
boat+underwater=submarine
data+understanding=knowledge
why+how=know
Finding utter nonsense with addition is harder, since one term tends to dominate (e.g., flying+dinosaur=tyrannosaurus rex), but you can get some amusing ones:
komodo dragon+ninja=yeti
gazpacho+internet=filking
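Addition works the same way as the other vector tricks: sum the two word vectors and look for the nearest remaining word. Below is a minimal sketch, assuming results are ranked by cosine similarity; I have not checked that this is exactly what the tool does. The vectors are invented by hand (dimensions are roughly [music, text, vehicle, water]) and `add_words` is a made-up helper name.

```python
import math

# Hypothetical hand-made vectors; dimensions are [music, text, vehicle, water].
VECTORS = {
    "song":       [1.0, 0.0, 0.0, 0.0],
    "words":      [0.0, 1.0, 0.0, 0.0],
    "lyrics":     [1.0, 1.0, 0.0, 0.0],
    "album":      [1.0, 0.2, 0.0, 0.0],
    "boat":       [0.0, 0.0, 1.0, 0.0],
    "underwater": [0.0, 0.0, 0.0, 1.0],
    "submarine":  [0.0, 0.0, 1.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def add_words(a, b):
    """Return the word nearest to a + b, skipping the query words themselves."""
    target = [x + y for x, y in zip(VECTORS[a], VECTORS[b])]
    candidates = [w for w in VECTORS if w not in (a, b)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


print(add_words("song", "words"))
print(add_words("boat", "underwater"))
print(add_words("song", "song"))
```

One thing this makes visible: cosine similarity ignores vector length, so x+x points in exactly the same direction as x. If the ranking really is cosine-based, the doubled examples (song+song, deer+deer) are just the nearest neighbor of the word itself with the word excluded.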