Wednesday, January 20, 2010

5. Memento Mori


We already have the technology to enable a strange sort of eternal life in the form of a digital avatar which would look and talk like the deceased.

See Original


So in the context of a machine "following" our speech to translate it, and generally keeping track of what we do, how we do it, and what caused us to do it in the first place, I struck upon something quite perturbing. Many of the applications for this built-up warehouse of knowledge are based on predicting how people would react to something hypothetically. This enables us to anticipate reactions, such as whether someone would enjoy a given movie or piece of clothing. Great stuff.

But…

Let's say I die tomorrow.

None of my profiles are going anywhere. Which means that if a new movie comes out next week, my profile could tell us whether or not I would probably, hypothetically, have enjoyed it- from beyond the grave.

So…

There's a couple. They've been married for 30 years, and during this time a device has tracked their interactions on a daily basis. One day the husband dies. The wife takes his built-up profile and turns it on. She sets it to hypothetical mode. She then says, "I had a bad day." The device runs through thirty years of data, looking for examples of when she had told him the same thing: over the phone, face to face, through text, etc. Thousands of times where she said pretty much the exact same phrase: "I really had a shitty day today." It then analyzes thirty years' worth of his responses, based on their frequency of use, context, and tone. It comes up with, "I'm sorry baby, what happened?" as a suitable response in text form. It then passes this through a speech synthesizer [profile] it had created from listening to his voice for 30 years, and suddenly she hears "I'm sorry baby, what happened?" in his actual voice.

The voice of her dead husband.

Responding as if he were alive.
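The retrieval step in this scene is, at bottom, a frequency count over past exchanges. A minimal sketch in Python, with an invented transcript log standing in for thirty years of data (the similarity test here is simple word overlap; a real system would weigh context and tone too):

```python
# Rough sketch of the response-retrieval step: find past exchanges that opened
# with a similar phrase, then pick the reply he gave most often.
from collections import Counter

# (what she said, how he replied) pairs mined from years of conversation logs
EXCHANGES = [
    ("I really had a shitty day today", "I'm sorry baby, what happened?"),
    ("I had a bad day", "I'm sorry baby, what happened?"),
    ("I had a bad day at work", "Want to talk about it?"),
    ("Dinner was great", "Glad you liked it."),
]

def likely_reply(prompt, exchanges):
    words = set(prompt.lower().split())
    # keep exchanges whose opening line shares enough words with the prompt
    similar = [reply for said, reply in exchanges
               if len(words & set(said.lower().split())) >= 3]
    # the reply he used most frequently wins
    return Counter(similar).most_common(1)[0][0]

print(likely_reply("I had a bad day", EXCHANGES))
# I'm sorry baby, what happened?
```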

But that's not all. It's also been watching his movements through her device's camera "eyes" this whole time. It watches how he eats, which hand he writes with, how his face reacts to certain phrases or news, how he turns when the doorbell rings, how often he sneezes, etc., statistically recording and tagging every single movement. She's stared at him long enough that her device has had time to put together a believable 3D model of the man. So now that he's gone, it can actually render him in real time (seen through her display) sitting at the table eating breakfast with her. And as we just talked about, they can have a "realistic" [with twenty quotation marks] conversation.

With her dead husband.

That's what this system can do. Now for the hypothetical counterarguments and reasoning:

First of all, it's fake and unnatural as hell.

Agreed. And so is porn, but, believe it or not, a very small percentage of the population has been known to use that from time to time. Think about airbrushing in advertisements. Think about silicone breasts. Think about steroids, codpieces, and hairpieces. Think about CGI in movies. Think about Neo fighting 50 computer-generated copies of Agent Smith. It never looks perfect, but eventually it gets close enough that we can trick ourselves into enjoying it.

And using it for porn.

No matter how good this simulation gets, it won't be able to recreate someone's reactions perfectly.

That's right, but neither does a photograph perfectly recreate someone's visage, nor does a video perfectly recreate their movement, nor does a phone message perfectly recreate their voice. Still, people look at photos of dead loved ones for all sorts of reasons. Just picture this system as a moving, interacting photo, which gives us an approximation of the way someone once was.

I'm reminded of Montaigne's introduction to his famous essays:

"[I am writing this book] for the pleasure of my relatives and friends so that, when they have lost me- which they soon must- they may recover some features of my character and disposition, and thus keep the memory they have of me more completely and vividly alive."

While pictures, mannerisms, characteristics, and thoughts are not the person in and of themselves, they can be used as stepping stones toward the real thing. Montaigne's essays are not the man- rather they are a path he has left us which leads us closer to understanding him in his entirety. Naturally this destination is unattainable, as no one ever really knows another person ("Everyone dies a stranger"). Still, the journey is interesting and much is learned in the process.
So all these pieces of someone's profile would not make up the person any more than a book would. Rather they serve a dual function: remembrance for those who knew him in real life, and limited understanding for those born after his death. Personally, at least, I would find it fascinating to learn what my great grandfather's favorite songs were. It would also be nice to hear him tell his favorite joke, and to see the reaction on his face when I tell him, say, that my sister is pregnant (the formula runs a search for every time someone told him that someone was pregnant, analyzes his reactive facial movements, tone of voice, content of speech, volume, etc., then synthesizes these things and there you have it). I'm left with a little piece of him, which helps me to imagine the whole person- a thought both comforting and disturbing for those of us who must die eventually…

-----
Relevant Links and Updates:

Life Recorders May Be This Century’s Wrist Watch

by Michael Arrington on September 6, 2009

Imagine a small device that you wear on a necklace that takes photos every few seconds of whatever is around you, and records sound all day long. It has GPS and the ability to wirelessly upload the data to the cloud, where everything is date/time and geo stamped and the sound files are automatically transcribed and indexed. Photos of people, of course, would be automatically identified and tagged as well.

Imagine an entire lifetime recorded and searchable. Imagine if you could scroll and search through the lives of your ancestors.

Would you wear that device? I think I would. I can imagine that advances in hardware and batteries will soon make these as small as you like. And I can see them becoming as ubiquitous as wrist watches were in the last century. I see them becoming customized fashion statements.

Privacy disaster? You betcha.

But ten years ago we would have been horrified by what we nonchalantly share on Facebook and Twitter every day. I always imagine what a family in the 70s would think about all of their photo albums being posted on computers and available for the entire world to see. They’d be horrified, they couldn’t even imagine it. Heck, a life recorder is less of a privacy abandonment step forward than we’ve already taken with the Internet and electronic surveillance in general.

A Business Week article talks about a ten year old Microsoft project called SenseCam (more here) that is just such a device.

It’s clunky today and doesn’t do most of the things I mentioned in the first paragraph above. But a true life recorder that isn’t a fashion tragedy isn’t that far away.

In fact I’ve already spoken with one startup that has been working on a device like this for over a year now, and may go to market with it in 2010.

The hardware is actually not the biggest challenge. How it will be stored, transcribed, indexed and protected online is. It’s a massive amount of data that only a few companies (Microsoft, Google, Amazon) are equipped to really handle anytime soon.

But these devices are coming. And you have to decide if you’ll be one of the first or one of the last to use one.

Will you wear one? I will. Let us know in the poll below.

--------
Last call: Japanese tombs link up with cell phones
Mon Mar 24, 2008 3:12am EDT
http://www.reuters.com/article/lifestyleMolt/idUST32710120080324

TOKYO (Reuters Life!) - Bereaved Japanese will be able to keep in touch with their loved ones beyond the grave by using mobile phones to scan bar-coded tombstones and view photos and other information about the deceased.

In tech-savvy Japan, the square black-and-white codes are already widely used to load maps on to mobile phones, and are usually printed on business cards or restaurant brochures.

Ishinokoe, a Japanese tombstone maker, will place the codes behind lockable stone doors on the tomb so only relatives with a key can scan them.

The idea was to create a tomb that would not just be a site for storing the remains of a person, but a place to honor the deceased, the company said in a press release.

Using their mobile phone displays, relatives can post and view different items that reflect on the life of their departed loved one, such as holiday snapshots.

A sample Web site displayed one photo showing a man posing with his family on a boat, and another showing the same man and a woman in front of a cluster of skyscrapers (here).

The stones will go on sale next month and cost around 1 million yen ($10,010).

But those who neglect their filial obligations should be warned -- the code will also allow other relatives to see a list of people who have recently visited the grave.

----------

http://www.techcrunch.com/2009/07/20/iwise-is-twitter-for-dead-people/


At first glance, iWise is “Twitter for dead people,” says founder and CEO Edo Segal. You can find nuggets of wisdom from famous people about anything—love, change, happiness, truth. Then you can follow those people in your own “Wisdom Tree,” which is a feed of quotes from the people you follow. In my Wisdom Tree, for instance, I’m following Benjamin Franklin, George Orwell, Ernest Hemingway, the Dalai Lama, and Jim Morrison.

There is some integration with Twitter itself in that you can sign in using your Twitter account and Tweet out any particularly good quotes you want to share. When you search for a quote about a particular topic, iWise shows you results both from the quotes it indexes off the Web and Twitter. The results are presented in a flowing real-time stream, to give them a feeling of immediacy. You can also receive quotes in your Twitter feed once a day, but only as a private direct message. And there is even a free iPhone app (iTunes link), designed to give you a little bit of wisdom every day.





---------

Sunday, January 17, 2010

4. Translation: Digital Lingua Franca

See Original
“Any sufficiently advanced technology is indistinguishable from magic.”
-Arthur C. Clarke [RIP]


As this device becomes a reality it will solve some of our oldest problems: automated, real-time, spoken language translation is possible today. [UPDATE]

From an essay I wrote in December 2008: “Using only free software and a steady internet connection, in half an hour I set up something which, while on the surface appearing as merely a clever exercise in futility, actually holds more social implications than most people seem to be aware of. The process is as follows: when I spoke- carefully pronouncing each word- into a makeshift "mic" made out of the cheapest possible dollar-store headphones and plugged into my computer's microphone input, the freeware program "Wav To Text" displayed what I said as on-screen text in real time. This was then copied and pasted into Google Translate, which would convert the English message into Spanish. This Spanish message was then copied and pasted into a simple online text-to-speech converter and turned into sound- the results of which can actually be seen at this link: http://tts.imtranslator.net/2cwi
The process can be summarized as follows: English speech ----> English text ---(Translation)---> Spanish text ----> Spanish speech (albeit robotized). Observing it in action- automated by a (free) Windows macro program- is actually not very impressive. It takes about 40 seconds and sometimes doesn't work. The initial results are impressive, however, because my test phrase, "What are you doing here in New York?" was successfully converted into a HAL-like voiced version of "¿Qué estás haciendo, aquí en Nueva York?" (the equivalent Spanish phrase) without any human guidance. What is more impressive is the fact that this was done in perhaps the most "ghetto" way imaginable- all software was free, all websites are public, and the hardware was some of the cheapest and most readily available possible. This could be repeated by almost anyone, so long as they had access to a working computer and the internet. Imagine what could be accomplished if the amount of research and effort that was put into Google Translate itself were applied to such a project.”
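Stripped of the copying and pasting, the chain is just function composition. A toy sketch, with canned stand-ins (my invention) for the recognizer, translator, and synthesizer:

```python
# Sketch of the speech -> text -> translation -> speech pipeline described above.
# The three stage functions are hypothetical stand-ins: a real system would call
# a speech recognizer, a translation service, and a TTS engine.

def recognize(audio):
    # stand-in for "Wav To Text": audio clip -> English text
    return TRANSCRIPTS[audio]

def translate(text, target="es"):
    # stand-in for Google Translate: English text -> target-language text
    return PHRASEBOOK[(text, target)]

def synthesize(text):
    # stand-in for a TTS engine: text -> spoken audio (here, a tagged string)
    return f"<audio:{text}>"

TRANSCRIPTS = {"clip1.wav": "What are you doing here in New York?"}
PHRASEBOOK = {("What are you doing here in New York?", "es"):
              "¿Qué estás haciendo aquí en Nueva York?"}

def speech_to_speech(audio, target="es"):
    # the whole chain: English speech -> English text -> Spanish text -> Spanish speech
    return synthesize(translate(recognize(audio), target))

print(speech_to_speech("clip1.wav"))
# <audio:¿Qué estás haciendo aquí en Nueva York?>
```

The point of the sketch is that each stage is independently replaceable- swap the canned dictionaries for real services and the composition itself doesn't change.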


So we've proven that two people without a common tongue could hold a spoken conversation in almost real time (the process could probably be shaved down to a one-second delay). To make this a little less impersonal, however, it would be nice not to have to rely on an artificial, robotic voice. Enter this technology:

"When Prodigy's next album drops, it could debut in nearly 1,500 different languages without the rapper having to so much as crack a translation dictionary. The lyrics to "H.N.I.C. Part 2" will be translated using proprietary speech-conversion software developed by Voxonic...
Here's how the Voxonic translation process works. After translating the lyrics by hand, the text is rerecorded by a professional speaker in the selected language. Proprietary software is used to extract phonemes, or basic sounds, from Prodigy's original recording to create a voice model. The model is then applied to the spoken translation to produce the new lyrics in Prodigy's voice. "A 10-minute sample is all we need to imprint his voice in Spanish, Italian or any language," said Deutsch..."
Now see it in action, keeping in mind that he doesn't know a word of Spanish and he isn't singing. This is a machine simulation of his voice: (sorry, the video keeps getting deleted...)


So a more complex version of my makeshift translation center would utilize similar technology, taking the text version of what it wants to say and using the voice profile that my device has built up and stored through hours of previous use, in order to produce the translated speech (as opposed to the robot voice). The way this works is by analyzing the exact overtones that make my voice unique, through Fourier analysis (again, it all comes down to a formula). Therefore, the listener would actually hear, "Entonces, ¿qué estás haciendo aquí, en Nueva York?" in what sounds like my own voice- even though I've never been able to speak Spanish! I speak in English, he hears Spanish- he responds in Spanish, and I hear English, ad absurdum. What social effects would this have if plugged into a video chat service like Skype? Google video chat was just integrated into all Gmail accounts as of 12/2008, and besides Google Translate, they've been working on speech recognition for some time:

"Today, the Google speech team (part of Google Research) is launching the Google Elections Video Search gadget, our modest contribution to the electoral process. With the help of our speech recognition technologies, videos from YouTube's Politicians channels are automatically transcribed from speech to text and indexed. Using the gadget you can search not only the titles and descriptions of the videos, but also their spoken content. Additionally, since speech recognition tells us exactly when words are spoken in the video, you can jump right to the most relevant parts of the videos you find. In addition to providing voters with election information, we also hope to find out more about how people use speech technology to search and consume videos, and to learn what works and what doesn't, to help us improve our products."
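As for the Fourier analysis of overtones mentioned above, the starting point really is just spectral decomposition: find which frequencies dominate a voice sample. A toy example on a synthetic signal (the "voice" here is three pure tones, not real speech; real voice modeling goes far beyond this):

```python
# Find the dominant overtones of a (synthetic) voice sample via the FFT.
import numpy as np

sample_rate = 8000                     # one second of audio at 8 kHz
t = np.arange(sample_rate) / sample_rate
# synthetic "voice": a 220 Hz fundamental plus two weaker overtones
signal = (np.sin(2 * np.pi * 220 * t)
          + 0.5 * np.sin(2 * np.pi * 440 * t)
          + 0.25 * np.sin(2 * np.pi * 660 * t))

spectrum = np.abs(np.fft.rfft(signal))            # magnitude spectrum
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

# the three strongest spectral peaks: a "voice profile" in miniature
top = sorted(int(f) for f in freqs[np.argsort(spectrum)[-3:]])
print(top)   # [220, 440, 660]
```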

There are some skeptics: "H. Samy Alim, a professor of anthropology at the University of California at Los Angeles who specializes in global hip-hop culture and sociolinguistics, also doubted the newly minted songs would retain the clever wordplay and innovative rhyme schemes inherent in popular music. Besides, he laughed, 'How do you translate "fo shizzle" in a way that retains its creativity and humor for a global audience?'"


While correct, this is an oversimplification. "Fo shizzle" wouldn't be translated, because certain things are better left alone and learned. I don't need to have "Habibi" translated into "baby" in order to understand the lyrics of Amr Diab…
Like I said though, real-time translation is possible today- it's just a matter of a little funding and cooperation.

So what would it mean if everyone on earth could understand one another?

Relevant links and updates:

http://research.microsoft.com/en-us/groups/speech/

A Trainable Text-to-Speech Synthesis

We developed a new, statistically trained, automatic text-to-speech (TTS) system. Unlike our previous, concatenation-based TTS, the new one includes these distinctive features: 1) a universal, maximum-likelihood criterion for model training and speech generation; 2) a relatively small training database, needing just about 500 sentences to train a decent voice font; 3) a small-footprint (less than 2 megabytes) hidden Markov model (HMM); 4) flexible, easy modification of spectrum, gain, speaking rate, pitch range of synthesized speech, and other relevant parameters; 5) fast adaptation to a new speaker; and, 6) more predictable synthesis for pronouncing name entities. With its easy training and compact size, the new HMM is ideal for quick prototyping of a personalized TTS.



Update: Google is on it. With Google Voice they're offering free transcription of voicemail (read: free labor for debugging) with a rating to let them know whether or not the transcription was accurate. Obviously they're recycling this back into the system to tell it where it fails. Soon they'll enable a feature like "translate this message." Then you'll be able to translate your instant messages. Then you'll be able to transcribe video conversations. Then you'll be able to translate these transcriptions. Then you'll be able to do it in real time, and we'll have opened Pandora's box.

http://bigthink.com/zacharyshtogren/your-next-translator-may-be-a-robot

One outstanding task on the global conversation to-do list is how to communicate across languages on all our various new media. Now, a linguistic brain trust at MIT has stepped in to develop a real-time solution to not understanding each other.

The approach, pioneered by Pedro Torres-Carrasquillo of MIT’s Lincoln Laboratory, requires audio mapping a speaker’s low-level acoustics—the intonation of vowel and consonant groupings. Pedro Torres-Carrasquillo found that by focusing on these tiny parts of spoken language he could arrive at a much more accurate identification of a particular dialect than analyzing phonemes—a language’s word and phrasal groups.

Though a few years away, the real-world applications of the work are sweeping. In a surveillance context, low-level acoustics mapping could let a wiretapper narrow down a criminal's location by dialect. From just one utterance, a phone system like Skype could identify a speaker's language and regional dialect for the common user.

3. Virtual Reenactment

See Original

In this world, passive learning and the discovery of information become effortless and constant. At the same time, the tools to help sort through this tangled net of data are placed at our fingertips. I've already talked about the universal time line idea, but that's just one of many possibilities. We could do the same for maps, where one would have access to a moving political and geographical map which could be played forward or backward in time, thus showing the movement of people and ideas throughout history. Key events, battles, quotes (imported from your profile, of course), and discoveries would show up as expandable markers. As things were added to your time line, they would be added to your map. In this way, geographic and temporal associations would be formed in a way completely unique to the possibilities of our age. It's often difficult to remember that at the same time Mozart was writing in Vienna, the United States was fighting a war for independence against England. A time-specific map would make this hard to miss. And with better historical associations, better historical understanding would be made possible. For instance, you might tell the map to highlight and show all US occupations and military involvements in the 20th century. This could be compared to all Russian or Chinese or French conflicts, etc. What this shows is still by no means the "whole" story, but at least it is better placed in context.
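The time-scrubbed map amounts to filtering place-and-date-tagged events by a time window. A toy sketch (the event list is hand-picked for illustration):

```python
# Toy version of the "time-specific map": events tagged with a year and a
# place, filtered by a time window so simultaneous happenings surface together.
from dataclasses import dataclass

@dataclass
class Event:
    year: int
    place: str
    label: str

EVENTS = [
    Event(1776, "Philadelphia", "Declaration of Independence signed"),
    Event(1781, "Yorktown", "British surrender at Yorktown"),
    Event(1782, "Vienna", "Mozart premieres Die Entführung aus dem Serail"),
    Event(1815, "Waterloo", "Napoleon defeated"),
]

def events_between(events, start, end):
    """Everything on the 'map' inside the chosen time window."""
    return [e for e in events if start <= e.year <= end]

# scrub the map to the late 18th century: Mozart and the American Revolution
# show up side by side, Waterloo does not
for e in events_between(EVENTS, 1775, 1790):
    print(e.year, e.place, e.label)
```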



Walking down the street, historic landmarks would each have Wikipedia-like entries which could be read while viewing the object itself. This isn't a new idea at all- in fact most of the articles are already written, and cell phones will do this within the year if they haven't already [geotagging, they've dubbed it]. But when combined with transparent visuals we get something completely new. Imagine walking up to the Twin Towers and watching a realistic, stationary CGI simulation of their construction at real size [UPDATE]. Time could be sped up to show the buildings rise in ten minutes or 30 seconds. Then imagine being able to watch a recreation of the September 11th terrorist attack with sound and visuals of explosions, audio bites of news anchors delivering the information, a montage of newspaper headlines, and simulations of running crowds, yelling firefighters, and lots of smoke- at real size and in a sort of transparent half-virtual reality. Or imagine walking onto a battlefield and being able to see a panoramic, 360-degree simulation of the battle of Gettysburg, complete with overhead maps of troop movement and the ability to hit "pause" at any time. Each of these simulations would come with three or four different levels of realism- after all, we probably wouldn't want to expose a group of ten-year-olds to the full carnage of warfare uncensored… There would also be much less depressing examples: the flight of the first airplane, a volcanic eruption, a solar eclipse, or a Roman sporting event. And here comes the best part: you wouldn't have to be at these physical locations. Of course it would be more interesting if you were, but there's no reason you couldn't run the simulation in the middle of any empty field, park, or parking lot. This would be an educational dream: "Alright kids, watch what happens to the Spanish navy during this storm...."
If nothing else it would keep students entertained, which brings us to our next point: people aren't going to use this for work so much as for fun. Ignoring any advanced features, this device is already a portable, full-size movie projector which can be linked up to watch with friends. It's also a portable computer, mp3 player, and gaming system. You can imagine any of the examples above being more than just simulations. They could be fully interactive strategy games, where you and other players actually command troops as generals on the field, or you try to shoot down planes before they can hit buildings, etc. This is essentially something like the virtual reality that gamers have been longing for since the days of Donkey Kong, but what's even better is that it can actually gather information from the real environment to become half real/half digital: mixed reality.
To use a familiar example, people could "carry" their Warcraft characters around with them throughout the day. While turned on, these characters would actually walk through the environment- avoiding walls, traffic, and other real life hazards [all through communication with online maps and visual environmental recognition through the user's camera]. Waiting in line for something, or just hanging out in a public place, you might see a stranger's character. He prompts you to fight. For two minutes the sidewalk is lit up with a battle that only the two of you can see. After beating him, you actually walk over and talk about the game- thus creating an opportunity to form a real life friendship through a digital introduction, like a walking social networking site. Coffee and Cigarettes for the 21st century, and, counterintuitively, a way to make people less isolated from the real world.
Anthropocentrism
Strangely enough, mixed reality is in many ways actually more compelling than a completely virtual world, and it will hold more lasting appeal based purely on human nature. Our psychology is tied to the world we are bound to. Even our fantasies can't escape: whether Greek, Hindu, Old or New Testament, our (no offence) mythological gods behave as humans and are concerned with our affairs. Our cartoon animals speak, love, and fight, as do the robots. Our stories, dreams, and illusions can reach a high level of abstraction, but they are always anchored to reality. There is another type which is not held to this rule, but as these leave the ground they cease to hold their social power. The fantasies of a madman might contain the most beautiful creations ever imagined, but they are either misunderstood or dismissed as irrelevant by those of us around him- what good is a social commentary on the inhabitants of Europa unless they love and hate like us? Until we can accomplish a Matrix-like "brain in a vat" experience, freely manipulating all five senses and therefore experience itself, the most interesting virtual reality will be that which we paint on top of the existing world around us. This layered world will be the most important cultural development of our generation, and will affect social interaction perhaps more than anything since speech.

Relevant Links and Updates:



Hear&There
An Augmented Reality System of Linked Audio

Hear&There allows people to virtually drop sounds at any location in the real world. Once one of these "SoundSpots" has been created, an individual using the Hear&There system will be able to hear it. We envision these sounds being recordings of personal thoughts or anecdotes, and music or other sounds that are associated with a given area.

We hope that this system will be used to build a sense of community in a location and to make places feel more alive. Over time, an area such as the Media Lab Courtyard can be filled with sounds from many members of the community so that new members can get a sense of who others in the community are. Then, the new member can drop his or her own sound into the space, adding to the collective definition.

To make the augmented environment as realistic as possible, we use spatialized (or 3D) audio, using Java 3D. This provides important cues to the explorer roaming the augmented environment, as it allows sounds to "appear" to be coming from a particular location in space.

In addition to being able to drop sounds in a space, Hear&There includes a graphical user interface to allow precise control over where a sound exists in space, how large it is, and various properties of the audio.

An additional interesting application of this project is the notion of activation networks. Although the user of the system can choose to explore all of the SoundSpots, they may also choose to take a more guided route. Using this approach, most of the SoundSpots in an area are "turned off." Whenever a user moves into a SoundSpot that is turned on, he or she is presented with the option of turning on other SoundSpots that the SoundSpot's author suggests.

This is the first stage of a project that will branch into new areas in the future. Some questions we may address in the future are the notion of temporal information (so that a SoundSpot changes over time), augmented communication channels within a space, and moving sounds.

2. Out of the Classroom, on to the Street

See Original

So if we had these digital specters which constantly followed our conversations, how else would they affect the way we interact in real life? It's best to start with a two person model. As in the preceding scene, let us assume that the people talking have already told their devices to link up, meaning that they'll feed each other streams of their owners' speech converted into text. Since the devices "follow" this discussion, they can be both helpful and transformative for the way we converse:

Nigel: "...not a simple test of brawn with no strategy or intelligence- that's why soccer's the real man's sport."
Joe: "Are you kidding? Just a bunch of guys running back and forth and never scoring. It's no wonder baseball has so many more fans."

At this point both Nigel and Joe hear an advisory chime as their respective devices pull up statistics proving Joe clearly wrong in his claim that baseball is more popular [overall sales, viewers, stadiums, and media coverage statistics]. Or it could provide qualifiers such as "....so many more fans in the United States," which would allow the conversation to take a different turn. As long as the feature is turned on, the device would tirelessly follow whatever conversation it hears, no matter how boring or absurd, filling in words when needed, providing evidence to support a claim, and even providing statistical polling of the general populace, or a specific demographic. [If this sounds far fetched, just look at the progress that Wolfram Alpha has made in understanding natural language. Also remember that every conversation is being recycled as input to enhance the system as a whole, the way in which Google improves itself through use.]
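The fact-check step above reduces a claim to a comparison over a stats table, and the qualifier falls out of which region you query. A toy sketch- all numbers invented placeholders:

```python
# Toy conversational fact-check: look the claim up in a stats table and answer
# with the region that was actually asked about. Fan counts are invented.

FANS_MILLIONS = {
    ("soccer", "worldwide"): 3500,
    ("baseball", "worldwide"): 500,
    ("soccer", "US"): 90,
    ("baseball", "US"): 150,
}

def check_claim(sport_a, sport_b, region="worldwide"):
    """Which of the two sports has more fans in the given region?"""
    a = FANS_MILLIONS[(sport_a, region)]
    b = FANS_MILLIONS[(sport_b, region)]
    winner = sport_a if a > b else sport_b
    return f"{winner} has more fans {region}"

print(check_claim("baseball", "soccer"))        # soccer has more fans worldwide
print(check_claim("baseball", "soccer", "US"))  # baseball has more fans US
```

Joe's claim is wrong worldwide but defensible with the "in the United States" qualifier- exactly the distinction the device offers.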
In another context we see two people talking about a book:

Alice: "You know most of these other authors are really contrived and don't say much, but he's not bad."
Matthew: "...yeah, he's really the best. I remember this one part where he said something like, 'he who doesn't know, doesn't know he's knowing'....er.. um….'he who thinks.....'"

While Matt desperately tries to remember the phrase, his device has been following his train of thought and knows what book he's talking about, so it simply performs a search of what he's saying: [he + who + thinks + know + knows + knowing + doesn't] and before he's even done stuttering his way toward a bastardized version of the quote, his visualizer shows him this:

"He who thinks he knows does not yet know what knowing is."
-Michel de Montaigne from On Conversation

Alice: "....wow, that's great. Send it to me."

Matthew taps a button, sending it to Alice's device, which stores it in her quote bank. Since this collection of sayings is specific to her personal device and thought process, her machine will also know to search these first when she's trying to reference something in conversation, as opposed to scouring the whole internet- like conceptual RAM storage, enabling quick recall.
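The quote-bank lookup can be sketched as keyword-overlap scoring over the personal collection, searched before anything wider. The scoring rule and bank contents below are illustrative only:

```python
# Toy quote-bank search: score stored quotes by how many words they share with
# the half-remembered fragment, and return the best match.

QUOTE_BANK = [
    ("He who thinks he knows does not yet know what knowing is.",
     "Michel de Montaigne, On Conversation"),
    ("Corruption of the best becomes the worst.", "anonymous"),
]

def best_match(fragment, bank):
    words = set(fragment.lower().split())
    def overlap(entry):
        quote, _author = entry
        return len(words & set(quote.lower().rstrip('.').split()))
    return max(bank, key=overlap)

quote, author = best_match("he who thinks know knows knowing doesn't", QUOTE_BANK)
print(quote)   # He who thinks he knows does not yet know what knowing is.
```

Because the bank is small and personal, even this crude scoring finds the right quote from Matthew's stuttered fragment; scouring the whole internet first would be slower and noisier.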

You can already see the desire for this type of thing: …create a personalized library on Google Books which allows you to label, review, rate, and of course, search a customized selection of books [quote bank]. These collections live online, and are accessible anywhere you can log in to your Google account. Once you've built a collection, you can share it with friends by sending them a link to your library in Google Books... .


This entire process is a way for the digital to compensate for the inherent problems of organic consciousness. Since Matthew had read this quote, been affected by it, and stored it, we know that he had placed special significance on the message behind the words. Sadly, however, no matter how influential these words have been on the way he thinks, speaks, and lives his life, the quote itself might become blurred in his mind as the lesson is committed to memory:

"...for when scatterbrains that have not the quote, do by nature the things of the quote, these, not having the words, are quote unto themselves in that they show the meaning of the words written in their hearts."

So for people like Matthew who have absorbed the concept (unzipped, hard to transmit) while losing the quote (a small zipped file containing more information than its size would indicate), this system would enable them to momentarily regain the original intellectual kernel for retransmission- in this case to Alice (who may or may not subsequently "unzip" these files through contemplation).

The system could also automatically remind us of relevant pieces of advice as we go about our daily business. Say Alice goes home and starts reading the Times online. She finds out that a senator she formerly respected was just caught shuttling cocaine-wielding hookers into the Hamptons with government funds. She utters a simple "goddamn..." under her breath, which her device hears and analyzes, "realizing" that she is upset about this article (it's keeping track of what she's clicking on and reading). It then reminds her of one of the quotes she's put into her bank but might have forgotten:

"Corruption of the best becomes the worst."

She might take a few seconds to rank this quote in terms of appropriateness. The device records her relevance rating and submits it to the larger online system, which then better understands which quotes are and are not relevant to an article dealing with political corruption. It does all this through a simple analysis of key words and their frequency of use, mixed with the results of how people have rated its past suggestions. Perhaps earlier that day, before many people had given their input, someone else was reading this particular article, and their device, realizing it was about politics [senator + democrat + funds + Washington + hookers], suggested the quote, "The most important political office is that of the private citizen." Our user then told the system that this phrase was hardly applicable to the situation. So based on trends in how people rate certain quotes in relation to certain articles, the system starts to "understand" the relations between subjects, and by the time Alice gets to the article, it has a pretty good idea of what works and what doesn't.
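The scoring loop described above- keyword overlap mixed with how past readers rated each suggestion- might look something like this minimal sketch. The quotes, ratings, and class names are all hypothetical, and a real system would use much richer text analysis than bare word overlap.

```python
# A quote's score for an article combines keyword overlap with the
# average relevance rating past readers gave that quote. The neutral
# prior (0.5) stands in for quotes nobody has rated yet.
from collections import defaultdict

def words(text):
    return set(text.lower().replace(",", "").replace(".", "").split())

class QuoteSuggester:
    def __init__(self, quotes):
        self.quotes = quotes
        self.ratings = defaultdict(list)   # quote -> past ratings in 0..1

    def suggest(self, article):
        art = words(article)
        def score(quote):
            overlap = len(art & words(quote))
            past = self.ratings[quote]
            avg = sum(past) / len(past) if past else 0.5  # neutral prior
            return overlap * avg
        return max(self.quotes, key=score)

    def rate(self, quote, relevance):
        # each user's relevance rating feeds back into future suggestions
        self.ratings[quote].append(relevance)

s = QuoteSuggester([
    "Corruption of the best becomes the worst.",
    "The most important political office is that of the private citizen.",
])
article = "senator caught in corruption scandal, the worst abuse of funds"
# an earlier reader marked the second quote as barely relevant here
s.rate("The most important political office is that of the private citizen.", 0.1)
print(s.suggest(article))
```

Down-rated quotes sink for everyone, which is exactly the crowd-trained behavior the scenario describes.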
In this way, each of us will individually develop a database of a few thousand pieces of wisdom that our devices will help us remember through a Mnemosyne-like process. [By the way, if you don’t understand this, take the time to read about “spaced repetition”, because it’s worth it.] These don't have to be limited to small quotes. They could be relevant images of movies, political cartoons, famous photographs, pictures of our friends, sound bites, etc. etc. etc. Basically, anything we don’t want to forget. The system could also be set to ratios of "discovery mode", which would mean that it would occasionally show you pictures, quotes and media that you had never encountered before. These could be things your friends found cool (the Del.icio.us model), or interests of "people like you" (who enjoy similar articles based on their ratings profile); or if you become interested in a certain subject, you could turn on a filter to give preference to anything involving Glenn Gould or the Cuban Missile Crisis, for instance.
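A toy version of the Mnemosyne-like scheduling mentioned above: each successful recall pushes the next showing further out, while a failure resets the interval. Real spaced-repetition software (Mnemosyne, SuperMemo's SM-2 algorithm) grades difficulty more finely; this sketch captures only the core idea.

```python
# Toy spaced-repetition scheduler: intervals roughly double on each
# successful recall and reset to one day on a failure.
import datetime

class Card:
    def __init__(self, content):
        self.content = content
        self.interval_days = 1
        self.due = datetime.date.today()

    def review(self, remembered, today=None):
        today = today or datetime.date.today()
        if remembered:
            self.interval_days *= 2   # push the next showing further out
        else:
            self.interval_days = 1    # forgotten: start over
        self.due = today + datetime.timedelta(days=self.interval_days)

card = Card("Corruption of the best becomes the worst.")
card.review(True)    # next showing in 2 days
card.review(True)    # then 4 days
card.review(False)   # forgotten, so back to 1 day
print(card.interval_days)
```

The growing intervals are what let a bank of thousands of items stay reviewable in a few minutes a day.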

1. On Education


This device could and should have an impact on our educational systems. Basic ideas like spaced repetition and individually tailored student profiles would allow us to learn much more efficiently.

See Original

“It is the medium that shapes and controls the scale and form of human association and action.” -Marshall McLuhan

In a world where this system is a way of life, almost every aspect of our routines would be changed in some way. We’ll start in the classroom: A teacher is lecturing to a group of students, and has told her device to link up with those of the students in her class. As they listen to her speak, her device automatically converts everything being said into text and relays the transcription to each of the students' machines. This would provide notes for later study without anyone having to take them down manually.
Additionally, as the lecture unfolds- let's say on the subject of nationalism and its role in the events which led to the First World War- each of the students' devices would behave uniquely. For instance, in a very simplistic application, if I were listening to her speak and she used a word I wasn't familiar with- "intemperate", let's say- my device would catch it and immediately look up and display the relevant definition on my visualizer (which could be ignored at the swipe of a finger, of course).
But how would it know the relevant definition? Through context. It would look at the entire phrase: "....as they should have known, but through intemperance and recklessness in their formation of alliances, European powers were caught in a web of....." Through simple logic (statistical, frequency based analysis of its use) my device would know that she had not implied the secondary meaning of the word- that the leaders of Europe had been signing treaties while drunk- but rather a synonym of excessive.
But how would it correctly guess that I didn't know the meaning of intemperate, as opposed to any other word throughout the lecture? Again, through simple statistics. It would have been keeping track of what articles I had been reading online and in print (remember, it has ever-watching "eyes", constantly converting street signs and everyday papers into text which it archives). It would also have been keeping track of the words I use in conversation and that other people use while talking to me, dialogue from movies and television I had recently watched, and even song lyrics I had heard. From this entire collection of my language-based interaction it would then be able to make a well-educated guess about whether or not I understand a specific word, inform me of the definition, and remind me of it from time to time.
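That well-educated guess reduces to bookkeeping: count every word the user has read or heard, and flag lecture words whose counts fall below some threshold. A rough sketch, with an invented corpus and an arbitrary threshold:

```python
# Sketch of "which word needs a definition?": everything the user reads
# or hears feeds a running word count; lecture words seen fewer than
# `threshold` times get flagged for a pop-up definition.
from collections import Counter

class Vocabulary:
    def __init__(self, threshold=3):
        self.seen = Counter()
        self.threshold = threshold

    def observe(self, text):
        # articles, conversations, movie dialogue, lyrics... all of it
        self.seen.update(text.lower().split())

    def unfamiliar(self, sentence):
        return [w for w in sentence.lower().split()
                if self.seen[w] < self.threshold]

vocab = Vocabulary()
for _ in range(5):
    vocab.observe("the powers of europe were caught in a web of alliances")
print(vocab.unfamiliar("the intemperate powers of europe"))
```

A real system would also normalize word forms and weight recency, but the principle- rarity in *your* personal corpus, not the language at large- is the same.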
“By ‘augmenting human intellect’ we mean increasing the capability of man to approach a complex problem situation, to gain comprehension to suit his particular needs…”
-Douglas Engelbart, 1962

While following the lecture, my device will also intervene in other ways. For example, my professor might continue talking about the networks of allegiance that forced the onset of such a war. Studies show and practice proves that a web of visual and conceptual associations weaves together in our minds to form memories- we call this learning. To assist my learning process, then, the device would also follow the teacher's speech in order to periodically pull up relevant images from online: "...and having each sworn alliance not only to each other but even to the allies of their allies' allies, nearly every European country was entangled in..." and suddenly this picture would pop up:

By seamlessly combining the two educational senses- hearing and sight- learning could be quicker and more efficient; there is a reason that political cartoons and other visual allegories have been so effective throughout history:

"Stop them damned pictures. I don't care so much what the papers say about me. My constituents don't know how to read, but they can't help seeing them damned pictures!"
-Boss Tweed on the political cartoons of Thomas Nast which did eventually remove Tweed from public favor and political power.

The balance between universality and individual tailoring would benefit the user incredibly. Having gone through over a decade and a half of school, I can't even count the number of times I've learned about the First World War. To my regret, however, I couldn't honestly tell you the date of the assassination which initiated it. I must have re-learned it half a dozen times at least, and yet I would still need to consult the internet for a few seconds before comfortably writing about it. I can't be sure of the exact cause of my faulty recollection, but I believe it has something to do with the fact that it was always taught in relation to a different aspect of 20th century history (American, World, first half, second half, art, culture, war, etc.). It was therefore placed in proximity to different concepts and in a different temporal position on the mental time line which I had formed to pass that specific class or test and subsequently forgot. If, on the other hand, this event had been shown to me on a visual time line that automatically added newly relevant dates and remained steady throughout my entire educational career, I'm sure I would know it by heart.


“…developing the new methods of thinking and working that allow the human to capitalize upon the computer’s help.” -Engelbart



This is the real significance of such a device, which from grade school on could build up a single historical time line and a malleable visual model to go along with it, hand-tailored for each student and therefore relevant to their unique interests and level. When paired with Mnemosyne-like techniques, these dates would be committed to memory in no time. Easier and more thorough memorization from a young age would actually facilitate a better understanding of history as a whole, from a broader perspective. Such a tool for the average person would be humanistic in the best sense- not to mention the telescoping effects it would have for young minds. We would actually increase both the quantity and quality of individual knowledge; if we do this for every individual, we do it for our entire society, which would become uniquely equipped to understand the present through a firm foundation in the past: “We learn from history that we do not learn anything from history.”

What effects would a non-forgetful public have on democracy?


Back in the classroom, this device would also affect the lecture itself. Imagine that beforehand, everyone in this class had submitted their academic "profile" to the professor. This profile contains a great deal of valuable information: what books each student had read or studied, their built-up historical time lines and level of relevant knowledge, classes taken, places visited, articles read, movies viewed, etc. A daunting amount of information, but analysis software could trim it down to a usable pulp, allowing the professor to better understand the specific needs of each individual and the class as a whole. A pre-lecture assignment could even be given on the basis of this knowledge: "I see that only two of you have read Zinn's history of the United States. Be sure to read chapter 14 by Monday, when we will discuss America’s role in this whole mess," making the class more efficient for all involved, while instantly standardizing a new universal education system. It would give you an immediate clue to the personalities of the students in your class, or of a new student transferring from a different school. [On the other side, this also sadly reinforces the pattern of past years, wherein statistics take precedence over factual accounts or experience, and everything is usefully but falsely compressed into digits. Claiming to know a student based on what books he's read is like claiming to know a person based on what music they listen to or what blogs they read. Like beautiful penmanship under the current keyboard tyranny, certain things must be shed to gain the advantages of digitalization. I talk about this in the third section of this essay.]

Saturday, January 16, 2010

Prologue: the Invention

A description of the electronic device that we'll all soon be wearing, a discussion of the reasons why it's going to be so popular, its uses, and the philosophical effects it holds in store for the individual and society as a whole.

See original

Technical:
This is an extra set of mechanical eyes and ears which records every detail of our lives while feeding us additional audio-visual material, thus simultaneously recording and augmenting experience.

Each of these components is controlled through a wireless “brain”, which we can imagine as something like a current cell phone, carried in the pocket to send and receive information.

Visual: A set of media glasses are worn. [very ugly contemporary examples] These come equipped with two cameras, one on either side of the face, which move in unison and record exactly what the user sees. These are further controlled through sensors on the insides of the glasses, which monitor eye movement just like those in modern digital cameras which can detect what the photographer is looking at in order to decide what to bring into focus within the shot. By having two cameras stationed on different sides of the face, the device is able to get a sense of depth, the same way our eyes do.
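The sense of depth from two cameras comes down to a single relation: for parallel cameras a baseline B apart with focal length f (measured in pixels), a point whose image shifts by d pixels between the two views lies at depth Z = f·B/d. The numbers below are purely illustrative.

```python
# Stereo depth from disparity: the same triangulation our eyes perform.
# Z = focal_length (px) * baseline (m) / disparity (px)
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    return focal_px * baseline_m / disparity_px

# e.g. cameras ~7 cm apart (roughly eye spacing), 800 px focal length:
# a point that shifts 40 px between the two images is about 1.4 m away.
print(depth_from_disparity(800, 0.07, 40))
```

Nearby objects produce large disparities and distant ones small disparities, which is why depth perception degrades with distance for the device just as it does for us.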
For output, these glasses have displays built into the glass, like a HUD. Because the images are displayed onto glass, they layer onto the real world like transparencies in an old-style projector. The user can still see what is happening in front of him. The virtual images can be increased or decreased in intensity, allowing for more or less of “reality” to reach the user. Although this might sound far-fetched, there are already working models by various companies which are soon to be marketed as the new necessary luxury. The simplest application will be to provide a mobile full-size computer screen. These companies claim that the view is like that of a sixty-inch screen viewed from ten feet away. So we'll never be away from the internet and all of its diversions, and while riding the train, we’ll be able to watch full-size movies.
But simple mobile entertainment is nothing compared to the new possibilities of ubiquitous computing and real time sight/sound manipulation, some of which can be read about below.

Audio: Worn in either ear will be something resembling the modern Bluetooth. It is a combination microphone and headphone, which listens to everything the user says and hears, recording it, transcribing it, and archiving it in the CPU.
The use of microphones in both ears provides stereo hearing, enabling sound location, in the same way our real ears do. The device will know on which side of you someone is talking by comparing the volume picked up by each microphone, for instance. Headphones in both ears provide stereo output. This can be used to create virtual depth for non-present objects. If the visual display is animating a virtual waterfall in your living room, these headphones would provide realistic sound that would change based upon your spatial relation to it. As you turned your head, the waterfall would become louder in the appropriate ear. [Click here for an entertaining* example of this type of headphone illusion] The way in which we listen to music would also be affected. Outside sounds could be picked up by the device and literally mixed into whatever song the user is listening to as they walk down the street. I talk about some of the infinite possibilities below.
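The waterfall illusion reduces to setting each ear's volume from the angle between the head and the virtual source- a crude interaural level difference. Real spatial audio uses head-related transfer functions and timing cues as well; this sketch captures only the idea.

```python
# Crude spatial audio: pan a virtual source between the ears based on
# where the listener's head is pointing, using constant-power panning.
import math

def stereo_gains(head_angle_deg, source_angle_deg):
    """Return (left_gain, right_gain) in 0..1 for a source at
    source_angle_deg while the head faces head_angle_deg
    (degrees; 0 = straight ahead, positive = to the listener's right)."""
    rel = math.radians(source_angle_deg - head_angle_deg)
    pan = math.sin(rel)                 # -1 = fully left, +1 = fully right
    right = math.sqrt((1 + pan) / 2)    # constant-power panning law
    left = math.sqrt((1 - pan) / 2)
    return left, right

# Facing the waterfall: equal volume in both ears.
print(stereo_gains(0, 0))
# Turn the head 90 degrees to the left: the waterfall is now entirely
# in the right ear, exactly the effect described above.
print(stereo_gains(-90, 0))
```

As the head tracker updates `head_angle_deg` in real time, the source appears to stay fixed in the room rather than inside the headphones.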

Control: By freeing the user's hands from holding a cell phone screen, we also lose the convenience of a keyboard and other buttons. This is a problem. To create real seamlessness, the user must be able to control this device in a very quick and unobtrusive way. Various ideas have been proposed, but I see the least awkward solution as a pair of wirelessly linked wristbands, which monitor the finger, hand, and arm movements of their wearer, transmitting the results as commands to the CPU. These would work by communicating with each other and the glasses in order to determine relative position through triangulation. They would be tight-fitting enough to feel the tendons in the user's wrists and so determine finger movement. The device could tell which finger the user was pointing with, without the user having to wear bulky gloves or undergo surgery to install sensors under the skin. A whole range of natural movement could be interpreted as commands. [taking a picture of something by forming a box with your hands, changing a song by snapping your fingers, etc.] Other people have already created basic versions of this. Whatever hardware becomes popular, the important concept is for the device to detect and respond to natural human gestures in a convenient and unobtrusive way.

Profiles: This CPU will channel every aspect of the user's experience into some sort of quickly accessible memory. What words we use, what articles we read, what movies we see, who we hang out with, what we eat, and even how often we shower. Anything useful or interesting. Everything is converted into statistics, which, like text files, take up almost no memory. Indeed, an entire lifetime of conversations turned to text could be stored within a single gigabyte, which can currently fit into a space smaller than a fingernail.
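The single-gigabyte claim survives a back-of-envelope check. The figures below (words spoken per day, bytes per word, compression ratio) are rough assumptions, not measurements:

```python
# Back-of-envelope: does a lifetime of transcribed speech fit in ~1 GB?
words_per_day = 16_000   # a commonly cited rough average for daily speech
bytes_per_word = 6       # ~5 letters plus a space, in plain ASCII text
years = 70

raw_bytes = words_per_day * bytes_per_word * 365 * years
print(f"raw transcript: {raw_bytes / 1e9:.1f} GB")

# Plain text compresses well (roughly 3-4x with ordinary tools), which
# brings the whole archive down to around the single-gigabyte mark.
print(f"compressed (~3.5x): {raw_bytes / 3.5 / 1e9:.1f} GB")
```

So the raw transcript comes out near 2.5 GB and the compressed archive under 1 GB, in line with the claim; audio and video are another matter entirely, which is why the profile stores statistics and text rather than recordings.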
This information will be used to compile a database, just like a profile today on websites like Myspace and Netflix; however, it will be an expanding profile which chronicles and eventually influences every aspect of the user's tastes, dislikes, and activities. Profile information is not typed in. It is auto-completed while the device watches everything its user does. This level of individual digital understanding leads to some of the most amazing and disturbing possibilities created by this new system. I talk about them all over, but especially in my essays on food, education, conversation, and death, ranging from the obvious to the revolutionary to the downright twisted. This is the philosophical realm in which this machine holds such mixed potential.

Philosophical:
For the purpose of these essays, “the device” as I call it, is a collection of hardware and software, the culmination of millennia of technical and artistic advancements, and an ever-expanding system set to understand and manipulate the way in which we experience the world around us. It is an extra set of ears and eyes, hearing what we hear and seeing what we see, while feeding us additional sounds and images based upon what we ask of it. It is a digital storehouse of organic memories where entire lives of information are kept and archived automatically, improving upon the limits of our own inherently forgetful minds. Similarly, it is a way to overcome many organic limitations: increased and augmented senses enable us to see and hear what was previously beyond our natural grasp.
And the best part is that it is already possible. The challenge is just putting it together and getting people to use it. These essays, then, serve as a call to wake up and confront the possibilities at our doorstep. Taken in their parts but especially in their entirety, they also serve as a heavy warning. In speaking about death, for example, I've come across possibilities for the remembrance of loved ones which are as beautiful as they are frightening- and we have to keep in mind that every invention is a double-edged sword. When thinking about the future I don't imagine human resistance against an outside enemy. I imagine our own defeat from within: the enslavement of ourselves, by ourselves. Like an ethereal battle, there will be no concrete entity which can be fought against. There is no mastermind behind the system. The grip is of our own human nature on itself, by itself. The same qualities which originally led us out of the jungle and onto the farm- the desire to live an easier life no matter the cost, even through the expenditure of more immediate effort- will enslave us to our pleasures and perpetual satisfaction. This is the defeat of the conscious by the subconscious, of the conscience by the over-soul, of the individual by the pacified mass. That is, of course, unless we take the proper means to understand and ensure otherwise. There are as many potential benefits as there are risks.

If, on the other hand, McLuhan’s assertion holds- that any invention contains its effects regardless of the context or way in which it is used- then these essays will merely serve as an extended definition of what, exactly, this highly multifaceted tool really is.

Most importantly, it is not my invention. It is merely the countless inventions of others, which I've copied and pasted together in order to study in my mind. In doing so, I've become convinced of similar desires in other people, both historical and contemporary, which points to the inevitability of a tool with as many potential disadvantages as positive traits. Instead of trying to delay a creature which has the weight of thousands of years of progress behind it, I suggest we take a long hard look at its pros and cons, in order to better cultivate the former before this way of life sneaks up unannounced.
