Monday, July 17, 2017

Virtual Reality and the Trough of Disillusionment

For Christmas last year, I picked up a $10 Insignia Cardboard VR for "my kids'" stocking.  I wasn't expecting much out of a piece of cardboard with two magnifying-glass lenses, a magnetic button, and a piece of elastic.


I was blown away.



After downloading a few apps for my Sony Xperia phone and almost melting the phone itself, I was amazed at how the seemingly simple visual magic trick of splitting a screen into a screen-per-eye immersed you in an alternate reality.  Everyone I showed it to was similarly impressed.  I tried it with a new set of Bluetooth headphones and found myself standing on stage at a binaural immersive audio concert, watching Paul McCartney play Live and Let Die on the piano a couple of feet from me.
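The per-eye trick above is simple to sketch in code. This is a minimal illustration (a hypothetical helper, not part of any real VR SDK): each eye gets half of a landscape screen, and the lenses do the rest.

```python
def eye_viewports(screen_w, screen_h):
    """Split a landscape screen into (x, y, width, height) viewports,
    one per eye, the way Cardboard-style viewers divide the display."""
    half = screen_w // 2
    left = (0, 0, half, screen_h)
    right = (half, 0, half, screen_h)
    return left, right

# A 1920x1080 phone held sideways gives each eye a 960x1080 view --
# so each eye sees only half the pixels you paid for.
print(eye_viewports(1920, 1080))
```

The halved horizontal resolution is part of why phone-based VR looks blurry: each eye effectively gets a 960-pixel-wide image stretched across your whole field of view.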

So why isn't everyone talking about it and wearing these things outside?

Well, Cardboard VR looks pretty silly.  It's, um, cardboard.  My kids liked it, yet they don't talk or ask about it the way they do with the iPad, Xbox, or even Pokemon.  It heats up my phone and chews up valuable storage space.  It's blurry.  You have to hold it to your face.  It doesn't have a lot of easily discoverable content.  You have to start an app on your device before you put it on to get it going.  There is no keyboard or mouse.  Talking to yourself (since you can't really see a device) isn't like talking to Alexa, Siri, OK Google, or the Xbox.

Perhaps my kids would use it more if we had a more professional headset, or a "kid-friendly" one that didn't require a phone.  I don't think so, though.  In any case, it's probably worse than an iPad in terms of the physical and mental health effects it would introduce to children, so I don't think it would be a good idea to let my kids play with it anyway.

"With appropriate programming such a display could literally be the Wonderland into which Alice walked." -- Ivan Sutherland

Running a dual split-screen display on a mobile device burns through your battery like there's no tomorrow.  At up to 90 fps for a good VR experience, it's no wonder you need a $1000 gaming rig to get the best experience out of VR devices like the Oculus Rift and SteamVR.
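The 90 fps figure implies a brutal rendering budget, and a bit of arithmetic shows why. A quick sketch:

```python
def frame_budget_ms(fps):
    """Milliseconds available to render one complete frame at a target rate."""
    return 1000.0 / fps

# At 90 fps the GPU has roughly 11 ms to render *two* eye views,
# versus ~33 ms for a single view in an ordinary 30 fps game --
# hence the heat, the battery drain, and the $1000 rigs.
print(round(frame_budget_ms(90), 1))  # 11.1
print(round(frame_budget_ms(30), 1))  # 33.3
```

Miss that 11 ms window and the frame repeats or judders, which in VR translates directly into motion sickness rather than just a dropped frame.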

Searching GitHub for VR brings up 17,000 repos.  Google VR is the third-best match returned.  Facebook 360 is up there.  WebVR and WebGL are going crazy.  There's the Jahshaka VR Content Creation Suite.  360 Video is the Next Big Thing™.


So why is Virtual Reality stuck in the Trough of Disillusionment?

The Gartner Hype Cycle shows that technologies flow in cycles over time.  At the Peak of Inflated Expectations, technology and startups are "changing the world," and the press is clamoring for articles and content on the technology.  Not everyone has the tech, but everyone wants it, and they don't really know why.  There are successes and many more failures.  Early adopters make or break the product or technology through their loyalty, interest, and evangelism.

And then comes a quick dip into the Trough of Disillusionment, where interest wanes as promises go undelivered.  Investors didn't get the quick return they were looking for.  Maintenance, support, and legal aspects of the technology start creeping up.  Everyone wants a piece of the pie, and the pie is getting a bit hard and crusty.

In the news recently, ZeniMax brought a $6 billion case against Facebook and Oculus and won a $500 million judgment.  Another case was announced a couple of days ago.  The pigs are feeding at the trough; not to equate anyone to pigs, but the saying seems to fit.

I'm a big fan of the two Johns (Carmack and Romero) and their impact on the gaming world with id Software.  They spawned a billion-dollar industry by starting with the idea that Super Mario Bros. 3 should be doable on the PC, outside a protected cartridge on a Nintendo console.  By creating Dangerous Dave in Copyright Infringement, John Carmack brought the concept to reality, while mocking the concept of copyright itself.  Shareware was in business.

John's .plan files are an interesting historical trip through the mind of a game developer and 3D pioneer.

"Well, I have learned enough about it. I'm not going to finish the port. I have better things to do with my time." - John Carmack

His OpenGL position from January 1996 tells a story about the state of the art in 3D and the heated battle between competing 3D standards among the hardware and software vendors and developers of the time.

"I am still going to press the OpenGL issue, which is going to be crucial for future generations of games." - John Carmack, .plan file, May 14, 1997

In 1996, 3dfx Voodoo cards were awesome, and I still have one in my basement shop tech graveyard.  They surpassed console and arcade hardware on the PC.  Mine was unstable as hell, overheated my CPU, was incompatible with some games, and crashed my PC all the time.  The bang for the buck and wow factor overcame all that.

Matrox, a company based in Quebec, had their high-end prestige Millennium cards and released the Mystique to compete with the Voodoo's price point.  Nicknamed the "Matrox Mystake," it didn't hit the mark.  Nvidia and ATI, a Markham, Canada company, took over the market.  Nvidia bought 3dfx.  Matrox is still in business.  The video card industry had just started its exponential rise to meet the demands of gaming and video software.  Technology was diverging, converging, and commoditizing.  The climb to the Peak of Inflated Expectations was underway.

"Many things that are a single line of GL code require half a page of D3D code to allocate a structure, set a size, fill something in, call a COM routine, then extract the result." - John Carmack

In 2011, Carmack suggested that Direct3D had surpassed OpenGL, though he still wouldn't use it.

In 2015, Microsoft brought DirectX 12 to Windows 10.  Lead developer Max McMullen stated that the main goal of Direct3D 12 is to achieve "console-level efficiency on phone, tablet and PC."
Vulkan is a closer-to-the-metal API for hardware-accelerated graphics, operating at a lower level than OpenGL.  At the time, Valve suggested it made no sense to use Direct3D 12 and to stick with Vulkan, though Vulkan wasn't yet usable commercially(?), which left Direct3D 12 as really the only commercial option.

Also in 2015, Microsoft announced GPU Capabilities in Azure.

2016.  OpenGL vs. Direct3D: Who's The Winner of Graphics API

2017.  OpenCL, OpenGL, OpenVX, Vulkan, WebGL.  DirectX 12.  However, gamers are no longer the only consumers of leading-edge graphics technology.  AI, deep learning, and GPU compute are the key use cases for the technology.  Clusters of machines running high-end GPUs are no longer used to display graphics at all.  Display drivers have become virtual compute drivers.  Microsoft has released the Azure N-Series NVIDIA GPU Virtual Machines.

So back to VR.  The key mainstream platform for graphics and VR in 2017 is mobile.  The key mobile device is Android.  And Android runs OpenGL.  Macs run OpenGL.  Does that mean OpenGL wins?

As Facebook says, It's Complicated.  Valve developed a wrapper to translate Direct3D to OpenGL.  Unity will produce both OpenGL and DirectX builds.  WebGL and WebVR are slowly becoming platform-agnostic mainstream technologies.  DirectX is still Windows, and OpenGL is still everything else.

The Slope of Enlightenment will come to VR when software, or virtually hosted hardware, replaces local hardware: when DirectX supports accelerated 3D over a network, similar to VirtualGL or Desktop Cloud Visualization; when the requirement for immersive virtual reality is that you look at an object and its reality is projected onto it, rather than putting on a device that projects it to you.

Push VR will make VR, and more realistically AR, a commercial success and commodity "necessity".





Tuesday, February 14, 2017

ADAM - Production Design for the Real-time Short Film

Georgi Simeonov and the team at Unity published a series of blog posts on a demo called ADAM.

The film is set in a future where human society is transformed by harsh biological realities and civilization has shrunk to a few scattered, encapsulated communities clinging to the memory of greatness.
Adam, as our main character, was the starting point of our visual design process. He was designed to provide a glimpse into the complex backstory of the world, by revealing himself as a human prisoner whose consciousness has been trapped in a cheap mechanical body.

My computer can't even handle playing back the video without a bit of stutter.  It's like a message will pop up any second telling me "this isn't something I can even fathom playing back for you in a timely fashion."  I can imagine what a $600 graphics card could do with this technology.

The amount of thought and depth that went into this short demo is staggering.  What I really find interesting is the concept art and reference sheets.  Google Goggles and machine vision seem like great tools for building these otherworldly characters.  Take a picture of someone and it will classify it, recognize color and text, identify brands and bar codes, and search for related images.  We could tweak the Cloud Vision or Microsoft Cognitive Services Computer Vision API technology to generate these reference sheets automatically, providing additional insight and ideas to the realtime art director.
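To make the reference-sheet idea concrete, here's a minimal sketch of the aggregation step only. It assumes you've already fetched annotations from a vision API; the input format, the 0.7 score threshold, and the helper name are all my own invention for illustration, not any real API's response shape.

```python
def build_reference_sheet(annotations):
    """Fold a list of vision-API-style annotations into a simple
    character reference sheet: confident labels, dominant colors,
    and any text found in the image.  `annotations` is a hypothetical
    pre-fetched response, not a live API call."""
    sheet = {"labels": [], "colors": [], "text": []}
    for a in annotations:
        if a["type"] == "label" and a.get("score", 0) >= 0.7:
            sheet["labels"].append(a["value"])  # keep only confident labels
        elif a["type"] == "color":
            sheet["colors"].append(a["value"])
        elif a["type"] == "text":
            sheet["text"].append(a["value"])
    return sheet

# Made-up annotations for a single piece of concept art:
demo = [
    {"type": "label", "value": "helmet", "score": 0.92},
    {"type": "label", "value": "statue", "score": 0.41},  # too uncertain, dropped
    {"type": "color", "value": "#6b7a8f"},
    {"type": "text", "value": "ADAM"},
]
print(build_reference_sheet(demo))
```

The real work would be in the upstream API calls and in rendering the sheet; the point is that the classification, color, and text signals the paragraph mentions map naturally onto the sections of an art-department reference sheet.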

What if a tiny device sitting in the middle of your living room could model the room with 360-degree depth video, add ceilings and floors, render light sources, color match, visually classify the contents, and determine what would look just a little bit out of place yet still match the color, lighting scheme, and design aesthetics of the space?

The tough problem is no longer modeling a room in realtime - I can do this with a Kinect and Skanect in about 5 minutes.

Well, there's still a few kinks to work out...

The tough problem is making this technology portable, non-intrusive, and insensitive to light sources like the Sun; using laser technology to capture accurate depth and distance across hundreds of yards of indoor or outdoor terrain; finding a way of bringing the experience to the individual, rather than bringing the individual to the experience.

What if you could take an experience like watching a hockey game, render it in realtime in VR, add binaural audio, and throw in a few of your closest friends from around the world?

There's still some room to grow with the APIs though. I'm pretty sure there's plenty more Joy, Sorrow, Anger, Surprise and Headwear at this hockey game.




https://blogs.unity3d.com/2016/07/07/adam-production-design-for-the-real-time-short-film/
https://unity3d.com/pages/adam

Assets here
https://blogs.unity3d.com/2016/11/01/adam-demo-executable-and-assets-released/

Tuesday, January 3, 2017

Deep Learning, 8-bit Dimensionality and Toys of the World

Google announced they are open sourcing the Embedding Projector, a tool for visualizing high-dimensional data, a hosted flavor of the deep learning tool TensorFlow.  Last year they announced TensorFlow would be open sourced.  Here's an overview of what TensorFlow is and how it may have been disappointing last year.
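The core idea behind tools like the Embedding Projector is dimensionality reduction: squashing hundreds of dimensions down to two or three you can plot. The real tool uses PCA and t-SNE; as a crude stand-in, here's a sketch that simply keeps the raw dimensions with the highest variance (the function and the toy data are mine, for illustration only):

```python
def project_top_variance(points, k=2):
    """Reduce high-dimensional vectors to the k raw dimensions with the
    highest variance -- a crude stand-in for PCA-style projection."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    variances = [sum((p[j] - means[j]) ** 2 for p in points) / n
                 for j in range(d)]
    # Indices of the k most "spread out" dimensions, highest variance first.
    top = sorted(range(d), key=lambda j: -variances[j])[:k]
    return [[p[j] for j in top] for p in points]

# Three toy "embeddings": dimension 0 barely varies, so it gets dropped.
embeddings = [[0.1, 5.0, 0.2], [0.1, -4.0, 0.9], [0.2, 1.0, -0.8]]
print(project_top_variance(embeddings))
```

PCA improves on this by projecting onto linear *combinations* of dimensions rather than picking raw axes, and t-SNE goes further by preserving neighborhoods instead of variance, but the goal is the same: a 2D or 3D picture of data you could never look at directly.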



Partying like it was 1999, I was working with Microsoft Site Server, analyzing web logs in multiple dimensions for the largest telecom company in Canada.  Standard dimensions for a web log would include host, referrer, URI, date, time, code, client IP, browser, and the mess that was cookies and session IDs.  One tool that sticks out for me was Microsoft Site Analyzer.  It was the coolest tool I had seen (out of Microsoft, at least) for visually understanding path analysis and links within a web site.

A huuuuge list of links to Neural network resources, published in 1999.

The interactivity of the tool and its bouncing, spiderweb physics really made it fun to use.  It didn't scale so well once we got up to 50,000 pages of content and 10GB of web logs a day...  and I think I was the only one actually using it.  Most of the e-marketers were just concerned with how many visits their respective campaign pages got, and less concerned with which pages link where or how people got to a page in the site.



Microsoft Data Analyzer was another tool for visualizing multiple dimensions that didn't really take off as well as the ProClarity and Cognos BI tools of the world.


A game-changing demonstration of interactivity, multi-dimensional data visualization, and time-series animation was the Gapminder bubble chart demo and its origins in the TED Talks of Hans Rosling.  Hans conveyed an exciting story of the world's population and life-expectancy data like a horse race, showing the staggering growth of India and China across the centuries and the effects of war and famine on the world's countries.

Dollar Street, the latest tool from Gapminder, uses 30,000 photos from 240 families in 46 countries (excluding Canada), telling the stories of families, their annual incomes, and the dimensionality of their lives, including what toys they have and how they brush their teeth.

Toys on different incomes

It is a great example of the power of data combined with multiple dimensions of visual imagery, personal stories, and heart-breaking reality.  The story, context and imagery behind the data points is much more important than just the numbers themselves or plots on a graph.

What does it mean to think in 4-dimensions?

For a more lighthearted example of a different way of visualizing 3D data, here's Wolfenstein 1-D: Wolfenstein in a single pixel line.  Not as good as Snake... or this deep reinforcement model which beats Snake.

In the '90s, John Carmack of id, Wolfenstein, and now Oculus Rift fame put together a side-scroller called Dangerous Dave to clone Nintendo's Super Mario Bros. for the PC.  At the time, it was amazing.

Now it's 2017, and we have Nintendo releasing Super Mario Run, a mobile side-scroller that does the side-scrolling for you.  And it's amazing?  Nostalgia sells; just ask Pokemon Go players.

People tend to avoid eye contact when speaking, as thoughts and verbal responses are more easily processed when the brain isn't distracted by visual processing.  This may explain the overwhelming feeling you can get when viewing something the brain cannot easily process, such as a multi-dimensional virtual reality video that shouldn't exist in your current reality space.

It may also explain why it's still fun to ignore having to move a character in multiple dimensions and just tap a screen like a slot machine while hurtling uncontrollably to the right, jumping on Koopa Troopas and Goombas.

And why the 8bitworkshop Atari 2600 REPL is amazing.
Deep Learning with Atari