Saturday, October 30, 2010

Let's Outsource Reading

Dear Friends,

I love to read. You love to read. Why not? A novel is a great experience. You live in it. You feel it. It becomes you. You become its author or characters or mood. The words in the book become the words you come to define your life with, and the very nature of your life changes because of the semantic meaning embedded in everything you see or do.

Sometimes, though, we are forced to read against our will. We are made to extract meaning from texts we have no intention of loving. Worse yet, even if we love the books we are obligated to read, how are we realistically expected to emerge from a reading with some objective idea of the book's plot, its language and its moral when its plot is endlessly assaulted by our subjective memories of life and of media representing life, its language assaulted by our idea of language, and its moral assaulted by our generally-firmly-established moral compasses?

Any of us in high school or college knows the pain of being forced to read novels we would never have read otherwise. Worse still, many teachers in the "liberal arts" don't grade based on execution but on the size of the intersection of their ideology and the student's. Now, if you can cull a few key phrases from your teacher, you can interpret a book according to their ideology without ever having to abandon your own!

Those of us in the world of non-matriculated adults wish we could feign a knowledge of books to impress coworkers, possible romantic mates, etc. The opportunity cost of reading a book is roughly a week's salary, nearly $1,000 if you are the average American. Would you like 40 hours and/or $1,000 back?

There are also those of you who love to read but for whom reading isn't enough. You read a book five times and then you read all the existing criticism of it, yet you still wish you could decipher the nexus of this book more. Well, my friend, I can help you apply the same exactitude Mechanical Engineers or Investment Bankers apply to their respective crafts, with a cornucopia of tools you never imagined could be applied to literature.

A quick demonstration: Let's say you had to read "A Tale of Two Cities" by Charles Dickens. Why, you just paste the URL of a website containing the book's text into my prebuilt function like so:

Please tell me the website containing the text you wish to analyze:
http://www.gutenberg.org/files/98/98-h/98-h.htm


Having been given the address of Project Gutenberg's free copy of the book, the program outputs a series of facts from which one could easily build arguments for an essay with perhaps just the assistance of the description on the back of the book. There is no doubt that with the aid of just the introduction or CliffsNotes, this output would be more than enough to produce an incredible and insightful essay at even the graduate level.

A few interesting characteristics:

The first output is the collocations, or simply the most common two-word combinations in the book. This is a great tool for figuring out the main characters and, combined with a little Google, can really help you decipher character development in a cinch. Some of the non-proper-noun combos can also be extremely telling of the plot.

The 100 most common two word sets in the book are: 
Building collocations list
Mr. Lorry; Miss Pross; Madame Defarge; Mr. Cruncher; Saint Antoine; Doctor Manette; said Mr.; Charles Darnay; young lady; Mr. Stryver; Mr. Lorry.; Miss Manette;Jacques Three; Sydney Carton; Young Jerry; Miss Pross.; Old Bailey; said Miss; Gutenberg-tm electronic;  n't know; one another; honest tradesman; electronic works;Archive Foundation; Mr. Darnay; Mr. Barsad; Monsieur Defarge; Temple Bar; golden hair; thousand seven; Mrs. Cruncher; seven hundred; Jarvis Lorry; Monsieur Gabelle; Mr. Carton; long time; Mr. Attorney-General; Good day; last night; Doctor Manette.; Madame Defarge.; electronic work; set forth; blue cap; Tellson 's.; North Tower

The limit to this function is your hunger for two-word combos.  I have set it to 100 for demonstration purposes.  You could set it to 1 or 1000.
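For the curious, here is a rough sketch of the idea behind that two-word list. It is not the starter kit's code: the function name `top_bigrams`, the regex tokenizer and the sample sentence are all made up for illustration, and it ranks pairs by raw frequency only. A proper collocation finder (NLTK's `Text.collocations()`, for one) also filters out filler words and scores pairs by how strongly they stick together.

```python
import re
from collections import Counter

def top_bigrams(text, limit=100):
    """Return the `limit` most frequent adjacent word pairs in `text`.

    Hypothetical helper: ranks by raw count, with no stopword filtering.
    """
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    # Pair every word with the word that follows it and count the pairs.
    pairs = Counter(zip(words, words[1:]))
    return [" ".join(pair) for pair, _ in pairs.most_common(limit)]

sample = "Mr Lorry met Miss Pross. Mr Lorry bowed. Miss Pross frowned. Mr Lorry left."
print(top_bigrams(sample, limit=2))
```

The `limit` argument plays the role of the 1, 100 or 1000 setting mentioned above.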

 Let's move on to some other significant output:


Words used in similar contexts to monstrous are 


Building word-context index...
big blameless certainly complete early easily exactly familiar far
final freely good heavily here ignorant impracticable late litter
little long


Here you can toss a word in and see all the other words in the text that appear in similar contexts, meaning they follow or are followed by the same or similar words as your word. This tool can be very useful for breaking down an author's style, or perhaps even for analyzing the speeches of a political leader or CEO. Applying it to Pride and Prejudice, for example, we find that for Jane Austen the word 'monstrous' had the same sense as 'very' or 'extremely'. Please note that 'monstrous' was a demonstration pick and you can use any word of your choice.
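To make "similar contexts" concrete, here is a minimal sketch under a simplifying assumption: a word's context is just the pair (word before, word after), and two words count as similar when they share at least one such pair. The name `similar_words` and the toy sentence are invented for the example; the real tool may well weigh contexts differently.

```python
import re
from collections import Counter, defaultdict

def similar_words(text, target, limit=20):
    """List words that occur between the same neighbours as `target`."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    # Word-context index: word -> set of (previous word, next word) pairs.
    contexts = defaultdict(set)
    for prev, word, nxt in zip(words, words[1:], words[2:]):
        contexts[word].add((prev, nxt))
    target = target.lower()
    target_contexts = contexts[target]
    # Score every other word by how many contexts it shares with the target.
    scores = Counter()
    for word, ctxs in contexts.items():
        shared = len(ctxs & target_contexts)
        if word != target and shared:
            scores[word] = shared
    return [word for word, _ in scores.most_common(limit)]

sample = ("a monstrous pretty girl and a very pretty house "
          "and a monstrous fine day and a very fine man")
print(similar_words(sample, "monstrous"))
```

In the toy sentence both 'monstrous' and 'very' sit between "a" and "pretty" and between "a" and "fine", so 'very' is the only word reported.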


The lexical diversity score of the book is:  11  

The lexical diversity is the total number of words divided by the number of unique words in the text; the smaller the number, the greater the variety used by the author. Nearly no books have scores under 8 or 9 (it would be hard to get a book out saying "the" just once), so you can consider that range of "Slightly Below Ten" as extremely diverse. Other great ways to apply it are to compare one author to others from their period, or the book in question to the author's other works. Perhaps the "young" books of the author are filled with more lexical diversity, as his/her primary concern was dazzling the world with incredible and perplexing descriptions. Perhaps the "old" books of the author exhibit low lexical diversity because plot, character development and meaning have grown to take center stage in the work, and words are now used like motifs in a symphony, repeating throughout the text so as to create a definitive belief system without ever explicitly stating it. These claims are big, but with quantitatively sound lexical diversity measures you can at least dare to make them without your teacher responding with his "instant-write-off-ability" powers.
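The score itself is a one-line division. A minimal sketch, assuming a crude letters-only tokenizer and lowercasing so that "The" and "the" count as one word (both are my choices here, and a different tokenizer will shift the number a little):

```python
import re

def lexical_diversity(text):
    """Total number of words divided by the number of unique words."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    if not words:
        raise ValueError("text contains no words")
    return len(words) / len(set(words))

# 11 words in total, 5 of them unique.
print(lexical_diversity("the cat saw the dog and the dog saw the cat"))
```

A short text always scores low simply because it has had less time to repeat itself, so comparisons are fairest between texts of similar length.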

Now imagine you're reading "Great Expectations", but you once saw your teacher at the video store with an Obama shirt on and a copy of "Steal This Movie: The Life of Abbie Hoffman" in his hand, walking by his desk you once saw a thick tome of Karl Marx's "Kapital", and his hybrid auto has a Dennis Kucinich bumper sticker. Let's be real. Unless he is an incredibly detached and objective person, fairly rare in this world, writing a paper that interprets GreatEx from the perspective of Adam Smith's Invisible Hand and Reagan's Supply-Side Economics is not going to get you an A.

Instead you can develop a thesis dealing with Dickens' Marxist side manifesting itself through uses of the words "poverty" and "justice" throughout the work.  But how to know them all without reading it?  Boom:


The word justice occurs in these contexts:
Building index...
Displaying 8 of 8 matches:
ock at my Lord Chief justice himself , and pulled
ge of the Lord Chief justice in the Court of King
s , the Tribunals of justice , and all society ( 
believe it. I do you justice ; I believe it. " Hi
 love of Heaven , of justice , of generosity , of
er of death , to his justice , honour , and good 
 love of Heaven , of justice , of generosity , of
 mind to impeach the justice of the Republic. She


The word poverty occurs in these contexts:
Displaying 7 of 7 matches:
ed the air , even if poverty and deprivation had 
abandoned her in her poverty for evermore , with 
ave repented them in poverty and obscurity often 
ations as their bare poverty yielded , from their
 in their children , poverty , nakedness , hunger
ss of twenty , whose poverty and obscurity could 
n the south country. poverty parted us , and she 




Naturally, you can modify the amount of text surrounding your search word, asking for just a handful of words to get a few stylistic details or thousands at a time to get the whole page featuring the search word in question.
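Here is a bare-bones sketch of that kind of search, with the adjustable window exposed as a `width` parameter. The function name `concordance`, the whole-word matching and the sample text are my own illustrative choices, not the shipped code.

```python
import re

def concordance(text, word, width=25):
    """Return every occurrence of `word` with `width` characters of context on each side."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    # \b anchors keep "justice" from also matching inside "injustice".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the search word lines up in a column.
        lines.append(f"{left:>{width}}{match.group()}{right}")
    return lines

sample = "They lived in poverty for years. Poverty parted us, and injustice followed."
for line in concordance(sample, "poverty", width=10):
    print(line)
```

Raising `width` from 10 to a few thousand turns the same call into the full-page excerpt described above.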

What about all those sources you need to quote? Building a critical background for your essay is as easy as applying the same search method to the journal articles you've thrown in your bibliography. Do a quick run-through on "poverty"/"justice" in a handful of articles and you'll have the framework of a critical literary essay worthy of Harper's without ever having read the book or the criticism you're quoting!

In the starter kit program I have provided these few key operations you can perform on a text.  I know hundreds more. Their output is not simply in the form of sentences and word lists but even graphs and charts or useful functions like search.  You can plot the fifty most used words in a text, do pie charts of word ratios to each other, or comb through thousands of pages with an algorithm much faster than the 'command f' you've been living by.

This program, with the output above (legitimately enough to write an essay), is on sale for only eight dollars. I can also give you hour-long training/support sessions for $50. We can do as many such sessions as you desire or as my knowledge of Natural Language Processing can fill. Or you can just pick some of the aforementioned other methods and I will add them to the code for you for just 4 dollars each (price negotiable once the number of other methods passes 10). Your choice.

We let computers free us of childish basic arithmetic, algebra, geometry and statistics years ago, and we proceeded to an incredible degree of understanding of the maths and sciences. Who would really believe, then, that we can't eliminate some of the more shallow reading functions and come away with a more profound knowledge of texts?

Tuesday, September 21, 2010

Death to Poets

Having loved writing indecipherable, surrealistic poems as a youth and remembering how little effort was really needed in their composition, I decided recently to write a poetry virus of sorts that would automatically generate such texts.

The Algorithm is more or less this:

1) Generate Text (This is the easiest. Using the .generate() method of the Natural Language Toolkit for Python you can generate texts of any desired length. They are n-gram generated, so that words will be followed by words in proportion to how often they follow them in the source text. Hence, if "whale" follows "the" 30% of the time in Moby Dick, "whale" ought to be generated after 30% of the "the"s in your generation. NLTK comes with a series of readily manipulable parsed and tagged texts, but you can go off and parse/tag one of your own if you think this exercise would be more exciting with a Kerouac or Philip K. Dick novel instead of one of the 8 or 9 they supply you with.)

2) Cut into Lines
    A) Create Monte Carlo number generator n<=14
    B) Cut to n words

3) White Spaces
    A) Create Monte Carlo number generator n < 6
    B) Assign n tab distribution across screen

4) Save to Word

5) E-mail to Poetry Publishing Sites
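
Steps 1 through 3 can be sketched in plain Python. This is only an approximation of the recipe: a hand-rolled bigram chain stands in for NLTK's .generate(), the one-sentence source text is a stand-in for Moby Dick, and the function names are invented for the example. Steps 4 and 5 (saving to Word and e-mailing) are left out.

```python
import random
from collections import defaultdict

def build_bigram_model(words):
    """Map each word to the list of words that follow it in the source."""
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate_words(followers, start, length, rng):
    """Step 1: pick each next word in proportion to how often it follows the last one."""
    word, output = start, [start]
    while len(output) < length:
        choices = followers.get(word)
        if not choices:  # dead end: jump to any word that has a follower
            choices = list(followers)
        word = rng.choice(choices)
        output.append(word)
    return output

def cut_into_lines(words, rng, max_words=14, max_tabs=5):
    """Steps 2 and 3: lines of 1 to 14 words, each indented by 0 to 5 tabs."""
    lines, i = [], 0
    while i < len(words):
        n = rng.randint(1, max_words)
        indent = "\t" * rng.randint(0, max_tabs)
        lines.append(indent + " ".join(words[i:i + n]))
        i += n
    return lines

source = "the whale swam and the ship sailed and the whale dived under the ship".split()
rng = random.Random(0)  # fixed seed so the same "poem" comes out every run
poem = cut_into_lines(generate_words(build_bigram_model(source), "the", 30, rng), rng)
print("\n".join(poem))
```

Because repeated followers are stored as repeated list entries, `rng.choice` reproduces the 30%-of-the-time behaviour described in step 1 without any explicit probability table.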


I was sincerely ready to create the whole system and become the most-published poet in history in a couple of days, but I thought that in doing so I would be doing a huge disservice to people like Rimbaud and Whitman et al.

At any rate, here's one of my Cyber-Poems, "Ahab"

 Moby Dick was 
                           fairly sighted 
from the hills .-- But the truth , the
very buttons of his bodily woes , 
           but by some experienced whaleman .
                                                                            After many similar hair - breadth escapes , -we ' ll be ready for it in two!- 
                                             -- that brawny doer of rejoicing good deeds  was wholly ignorant
of the horizon , like a wild set of mariners. 

But how fair ?
Fair for death ; 
how he lords it over the world!     
                       , so that when several ships are but subtile deceits , 

not only to be no hearts above the
                                                  common sperm whale ' s turn .

 the Leviathan that crooked serpent ; and nailed to her
                                                            highness a prodigious sensation in all his persecutions ; bethinking
it -- now over the head ,           
                                     and therefore
to ye , ye mates , seeking repose within six inches of his Captain
                                                                                                to mind the regular features 
of his Dutch whale fleet to be
                                           susceptible to atmospheric distension and contraction . If ye see ,
that thinking after all was caused by an awful question . 

................
It's certainly not a good poem, I realize that. What makes me proud, though, is that I'd throw it in about the 85th percentile of modern surrealist poems in terms of quality. So, I beg you to ask yourself: if surrealist poems are generally terrible anyways, instead of trying to improve them, why not just automate the process of their composition? They will continue to be horrible, only now humans can spend more time writing prose, studying engineering and uploading YouTube videos.....

Sunday, September 19, 2010

Augmented Reality @ Google Zeitgeist

Augmented Reality Plus a Couple of Other Great 7-10 Minute Speeches from the Geniuses over at Google Zeitgeist.


Watch minutes 3:40 to 19:59 to hear Maarten Lens-Fitzgerald of Layar.  Also immediately download Layar if you have an Android and/or >=iPhone 3GS.


Andreas Dengel on Text 2.0 from 20:00 to 26:18 is a delight as well.

Saturday, September 18, 2010

Employee of the Month in the Era of Augmented Reality

The Alpha Male/Female/GenderAmbiguous of 21st-century business is about to undergo a revolution. While much of the Social Networking Web 2.0 stuff hitting the office is causing true revolutions in synergy and productivity, it doesn't do much to impact the current Machiavellian order of office politics. Knowing the right names, talking to people the right way, passive aggressiveness and chess-like strategy are still essential. All these current (and longstanding) real-life processes just start to take up so much more typing, FaceTime, Cisco technology and bandwidth.

Politics will always matter. I'm not going to stand on a podium and tell you that just because we are undergoing a revolution in man/machine interface that suddenly "who has the best ideas/can execute ideas of others" is going to trump who's the "coolest guy to have a beer with/the hot-smart-aggressive girl".   The possibility of that trumping occurring though, is going to increase significantly.

We all remember PreCalculus and how we learned that we can add two functions, say f(x) and g(x).  Well, success in business, particularly when business means working under others in a business hierarchy as it so often means, is largely the result of two functions, where f(x) is your ability to execute short and long term objectives and g(x) how much you please those around you/who your family is/how cheery you are/how much you understand the dynamics of team play.  

These functions are then weighted based on the nature of your business. Businesses like IBM, Microsoft, Google, Zynga and Facebook obviously are looking for a more heavily weighted f(x) execution-pool than Wynn Resorts, a host of mutual funds, and nearly every Mom-And-Pop store on Earth, which weigh g(x) more heavily.

Of course we live in a world that elected George W. Bush twice. People loathe Goldman Sachs CEO Lloyd Blankfein for quantifying and avoiding obvious risk the world was willfully blind to. The Great Steve Jobs got fired from Apple. No office, sector or environment, big or small, tech-centric, sales-centric, marketing-centric, long-tail, short-tail or state-sponsored, is putting more emphasis on f(x) than g(x). Let's go nuts and say the global aggregate weighted average is something like 0.2 f(x) + 0.8 g(x). (Caveat Emptor: The United States is probably a lot more execution-centric than the global function, as in places like Latin America and Eurasia you can often become a huge success without producing anything of value at all.)
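
To see how much the weights matter, here is a toy calculation. Everything in it is made up for illustration: the 0-to-1 scale, the two sets of scores and the alternative weight of 0.7 are assumptions, and only the 0.2/0.8 split comes from the guess above.

```python
def office_score(f, g, weight_f=0.2):
    """Blend execution f and likeability g (both on a 0-1 scale) into one score.

    weight_f is the weight on f(x); g(x) gets whatever is left over.
    """
    return weight_f * f + (1 - weight_f) * g

executor = (0.9, 0.3)   # hypothetical: executes brilliantly, poor at office politics
networker = (0.5, 0.9)  # hypothetical: mediocre execution, great at office politics

for weight_f in (0.2, 0.7):
    print(weight_f,
          round(office_score(*executor, weight_f), 2),
          round(office_score(*networker, weight_f), 2))
```

With these invented numbers the networker comes out well ahead at the 0.2 weight, and the ranking flips once the weight on f(x) climbs to 0.7.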

Well, Augmented Reality is about to drastically change the weights. Augmented Reality, the great democratizer of the Social Web World. The hotshot at the office will still be the guy with assiduously combed hair who gets on well with all the female associates, and the CFO's nephew will still pull incredible projects out of non-effort magically at deadline time. We will still be human beings with human tastes and preferences, after all. Only now they're going to have to share the spotlight with two new archetypes (really three, as sections of both archetypes share a large Venn diagram-style intersection): the Passionate Video-Game Enthusiast and the Fanatic Mathematics Hobbyist.

Both of these types have always been tremendous executors and have gotten on fairly well in office environments. They're scattered across all departments from Operations to Sales, although their mecca seems to be IT. Their f(x) has always been so high that it's overridden their poor g(x) complement. Until now, though, f(x) has not generated returns large enough to warrant increasing its weight. Also, due to the vertical nature of information and the still-persistent Machiavellian environment of the office, their poor g(x) actually hinders their potential f(x), as they never quite know where to turn on large floors or who to talk to about what, and their interlocutors consistently prize defusing the immediate situation of their contact more than finding an optimal solution to their mutual problem.

Augmented Reality promises to change all that. I present to you David, my theoretical protagonist. Obsessed as a youth with Rubik's Cubes, rabidly happy in his undergraduate Logic and Algorithms classes, and Sim City-trained to calculate derivatives on the spot toward specific amortizable time-oriented objectives, he's gotten a job as an Associate to a Wealth Advisor (aka Broker with Rich Friends last century) at the Private Bank of some anonymous large New York bank.

Although David seems to execute trades across various financial instruments for various clients over various distributions of accounts with ease and precision, he's looked on as a bit of an oddball amongst the 60 or so Associates on the floor, and although his productivity is in the range of the top eight Associates, he's generally valued by the floor managers and HR folks as much closer to the median. Points against him include irrelevant topics like "He's not that fun to talk to in the elevator" and "He eats lunch alone playing with his iPad". Although nobody actually says it, they all find it spooky how he subscribes to the Financial Times to read it, not to use it as a cute fashion accoutrement. As an intern rotating from advisor to advisor, he'd almost been ruled out after the first couple of weeks due to his odd laconic conversational style and public school background. Only his ability to calculate all kinds of Bach-like variations on options and dual-currency notes had saved him.

Well, in the spring of 2012 the IT department declares that they've finally set up a great AR system and a really nonintrusive set of normal-looking glasses you can use it with. Suddenly you can run through clients doing all kinds of great search variations and even invent your own search algorithms. Menial tasks, like paying off American Express cards for rich families you only want around so you can sell them huge portions of Latin American debt, can be automated with a few simple, precise if/then statements and while loops. The employee phonebook gives you key personal talk-time information about each of your contacts, as all of them have been asked to fill out a survey about personal tastes and gustos. So while you work out a revolving loan with Ralph from Credit for a family of Scottish architects currently living in a high-risk country in South America, you are prompted to bring up the Dodgers (his favorite team) or Frank Sinatra (his favorite singer) whenever you reach information bottlenecks too large to keep talking about the business-task-subject but too small to warrant calling back to resolve later.

David takes to the new changes like a fish to water. Even Ralph later comments to Peggy from HR that he thinks "my man lil' Davey" is finally getting some serious action in the bedroom, because talking to him is like talking to a new man. Only possible explanation for such a huge, great change, claims Ralph. Peggy makes a mental note, with a virtual sticky widget placed in the upper right-hand corner of her vision-field, to investigate later.

David starts to come up with great ideas. He automates client-alert processes on great investments by creating a machine-written newsletter that advises on three-week trading trends in ETFs tracking major indices, powered by a semantic parser that combs Reuters and AP newswires for key sentiment-phrases. When he does trades with foreign-language clients he uses an automatic translation system. The ambiguity of phrases puts constraints on live translation that would cost him significant time, if he weren't really just using the system to mask the fact that the image of him in the telecommunication feed is an earlier video of him smilingly making mouth motions on loop. In reality he's juggling a loan to a real estate mogul in Hong Kong, selling a derivative that shorts the S&P 500 3x to a graphic designer in Colombia, and getting a Bengali fishing company magnate to load up on GOOG shares before Google releases its new Google Maps Integration Suite that lets you physically explore 3D simulations of foreign continents and have business conferences on top of Mount Rushmore.

Carol, his arch-nemesis, is climbing the corporate sentiment ranks by networking at parties and bars all over midtown Manhattan and the Upper East Side. He's quickly catching her by networking with the data structures and algorithms central to daily and/or high-return processes.

Soon John Wu, the head honcho of the floor, has a talk with Irene Bellanotte, the Wealth Advisor that David has been working under for the last 18 months.

"He's just too valuable Irene.  I know he's made you a lot of money and you have compensated him justly with the increasing returns but the potential he has to automate the office and train the other Associates is just too great.  This isn't coming from me.  It's coming from the men upstairs.  They've created a new position for him, in fact, given his multitasking prowess they've given him two.  He's Vice President of AR integration on the floor full-time  and will consult as Automation Architect throughout the company periodically."

David is but an isolated case.  The consequences could be much greater.  Great adapters, the kind of people who speak two languages, play three instruments and delight at taking new, even lower-paying jobs occasionally  for the sole purpose of challenging their ability to change their skill set will become like East Hampton's Timeshares for great companies.  Hewlett-Packard, Dell, Apple and Asus will share some great Operations expert three months a year, cut up to the smallest distributions possible where said great employee can effectively execute given that he has the ability to instantly comb and manipulate gigantic portions of information (as opposed to data, what he would have to comb through and with difficulty attempt to manipulate today).  

Pilots and Soldiers will have Cartesian planes laid over their eyes and manipulable targets that move based on complex environmental-factor modeling software. Children will have tutors with graphical demonstrations and trial problems that respond immediately to key issues they cannot grasp. Lonely people will have lovers. Husbands and wives will work on separate continents without considering theirs a long-distance relationship. You will never be lost in another city for as long as you live. When storage becomes sufficiently cheap, and storage is becoming cheaper faster than anything else in technology, you will be able to videorecord your entire life and spot search to remember forgotten phrases, faces, names and passwords.

Social Networks will revolutionize the idea of "being social" insofar as they change the scale and possibilities of what is considered "being social." Augmented Reality will change the very parameters with which we define "being social". Social Networks are changing the domain and resultant range of our function, but Augmented Reality will change the formula of the function itself. In the real world of love and entertainment we'll adapt as easily and thoughtlessly as we have to any great upheaval, but in the too-big-to-fail world of large corporations born of the 20th century, where Homo Economicus meets Vito Corleone and everything is fair if you do it with a cheerful attitude, things are going to be flipped on their ears. If the business environment were a TV show, today it'd probably be "The Office", but in the world of Augmented Reality it will be some new show that feels like a mix of equal parts "The Matrix", "Inception", and "Tron."

Virtual Reality was a fantasy of long-dead losers with no imagination and no glimpse of the tech specs. The real revolution is Augmented Reality. Rappers in the 90s were fond of the phrase "Get yo' head right." Soon Kanye West will have a song climbing the iTunes download charts called "Get Yo' Headset Right".

Top Fifty Business Books

Whip out your Kindle, iPad, Nook or Streak and start reading Bloomberg's Top Fifty Business Books of the Year.

Augmented Reality on Bloomberg Innovators

Not only did Bloomberg surprisingly hire MTV news alum Tabitha Soren, they did a great special on Augmented Reality on Innovators.

Bloomberg Special on AR

Augmented Reality is the technology that will be nascent during our current late Smartphone/early Tablet phase, but will truly come to beautiful fruition during the late Tablet/early Headset-Glasses stage.

This is a longshot call not 100% related to the post, but I want it on record that I believe the first commercially viable Augmented Reality headset will be a post-iPad product from Apple called iGlasses.