It is very difficult to predict the future. But it is not so hard to predict what the future is not.
I am convinced that the most important technology in the future world of LUI has not yet emerged.
In the animated series, collecting all seven Dragon Balls summons Shenron. Large language models (LLMs) like ChatGPT are definitely one of the most important Dragon Balls, but there must be another, particularly important Dragon Ball that has not yet appeared, so it will take us some time to summon Shenron.
Around 2004, the 2G era of mobile phones was about to give way to 3G. At that time, as a staunch optimist about the mobile Internet, I believed that mobile phones would definitely be the future. But looking at the Dopod in my hand, with its plastic buttons and small screen, I could never say with a clear conscience that the experience was better than a PC's.
The industry had been discussing the mobile Internet for at least five years before the iPhone was born in 2007. It turned out that, beyond the computing power of mobile phones and 3G, the mobile Internet had also been missing something that most of us were completely unaware of: the multi-touch touchscreen. With amazing imagination, Jobs gathered the last Dragon Ball and summoned the Shenron that was the iPhone.
Every time I try to imagine the world of natural language interaction interfaces (LUI) enabled by large language models, a thick fog blocks my view, making it difficult for me to see the specific product forms of the future.
How will humans and machines interact in the future? Will we still chat with keyboards, or will we use voice? There must be some technologies that we are not paying enough attention to right now and that, combined, will complete the set of seven Dragon Balls.
What is this missing Dragon Ball?
If I had to guess, I would consider AR the most likely candidate. Google Glass may have gathered all the other elements in its day, but it lacked a more natural way of interacting. The hardest problem for Google Glass was input. If a large language model solves the input and glasses projection technology solves the output, the result may be a revolutionary new product. After all, before the iPhone there were pioneers such as Apple's Newton and the Palm handhelds. Google Glass may likewise turn out to be a pioneer of a truly revolutionary AR device in the future.
Of course, the only thing I know is that all my current guesses must be wrong.
Note: The photo shows me in January 2004, using the webcam on my Dell computer to publish a video stream through Windows Media Encoder and Windows Media Server, in an experiment to view the stream on a Windows Embedded-based Dopod mobile phone connected to the internet over GPRS. More than a decade later, this is what we call "live streaming." It shows once again that a technology may take a decade to be widely adopted in society. (For the specific steps and equipment, see the original experiment record from 19 years ago via "Read the original text.")