5 Predictions for AI in 2022

How AI breakthroughs in 2021 will decide its future

10 min read · Jan 9, 2022

Artificial Intelligence is growing extremely fast. The increasing complexity of the world around us has motivated many organizations to rely on AI to assist humans in many ways, from detecting fraudulent transactions and predicting market prices to supporting decisions by high-level stakeholders. The opportunities and resources are vast, so there is little reason for any enterprise not to try AI. It is therefore worth asking where AI is heading next. Here is a brief recap of what happened in 2021.

A short timeline of AI headlines in 2021

A short timeline of AI throughout 2021 (Courtesy: Yohanes Nuwara)

The good story of AI throughout the year began in February, when Google released TensorFlow 3D, which extends deep learning models into 3D space and enables 3D scene understanding for virtual reality, point cloud applications in imagery and LiDAR, and vision for self-driving cars. In March, Facebook (renamed Meta in October) released SEER, a self-supervised computer vision model that learns to recognize images without labeled data, the kind of unstructured data that is abundant on social media. SEER was pre-trained on a billion random, unlabeled, and uncurated public Instagram images and then evaluated on ImageNet. April brought no major release; however, the European Union proposed new AI regulations to provide a horizontal legal framework for AI across the region. The proposed framework focuses on the specific uses of AI systems and their associated risks.

In May, Google released Vertex AI, a platform integrated with Google Cloud for building ML models with automated ML (AutoML) and pre-trained APIs for vision, video, natural language, and more. Vertex reduces the complexity of running ML pipelines through low-code development. In June, Microsoft’s GitHub released GitHub Copilot, which accelerates coding through autocompletion: before a developer finishes typing, Copilot suggests the rest of the code. In July, Google’s DeepMind released the predicted shapes of more than 350,000 proteins using its AlphaFold AI system, developed a year earlier. Some claimed that this database could revolutionize many fields, for example by accelerating our ability to understand diseases and develop new medicines.

In August, researchers from Carnegie Mellon University and Massachusetts Institute of Technology published a new kind of Generative Adversarial Network (GAN), which they called GAN Sketching, that can produce realistic images from nothing more than sketches. Then in October, NVIDIA, together with Microsoft, combined two powerful language transformers to create the Megatron-Turing Natural Language Generation (NLG) model, which exceeds OpenAI’s powerful GPT-3 in size. The model was trained on hundreds of billions of natural language tokens with GPU acceleration and is designed to improve training efficiency 10-fold. In November, NVIDIA again made headlines with StyleGAN3, the next generation of GAN, which can generate synthetic human photos that look almost 99.9% realistic. Finally, in December, DeepMind released another natural language transformer model, Gopher, which can synthesize responses in human-computer interaction.

Based on these breakthroughs, here is what I see as the five most likely predictions for AI in 2022.

Prediction 1 — AI will be more explainable and automated in business

Mike Connell, the COO of Enthought, said that more than 90% of industrial AI or ML projects are likely to fail to achieve their business objectives in 2022 because their models cannot be explained to the business. Explainable AI (XAI) has become crucial because unexplained models pose threats to enterprises, for instance a model that produces biased outcomes. High-level business stakeholders will focus on how the model represents the real business problem rather than on how to code it. In addition, the need for low-code ML model development in enterprises is becoming unavoidable, because developing even one model can be tedious and iterative. Automated ML (AutoML) has developed rapidly in recent years to meet this need.

The H2O Driverless AI dashboard (Source: H2O User Guide)

Currently, only a few companies develop both XAI and AutoML. One of the most popular is H2O.ai, Inc., which names its service Driverless AI. This service enables data scientists to develop machine learning models faster because its AutoML capability pipelines the whole process automatically (data exploration, model selection, and evaluation) and explains the models to business stakeholders through visualizations. This counters the perception that machine learning is a “black box”. Additionally, Google provides XAI as part of the Vertex AI cloud platform that it released in May 2021. These offerings are likely to propel the development of automated and explainable ML models in the cloud even faster in the coming year.
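To make the idea of explaining a model a little more concrete, here is a minimal sketch of permutation importance, one common model-agnostic explanation technique. It is not how Driverless AI or Vertex AI works internally; the `model` function, the feature names, and the data are all hypothetical and chosen only for illustration. The idea is to shuffle one feature at a time and measure how much the model's error grows.

```python
import random

def model(row):
    # Hypothetical stand-in for a trained model: a fixed linear scorer that
    # depends strongly on "amount", weakly on "hour", and not at all on "noise".
    return 3.0 * row["amount"] + 0.5 * row["hour"] + 0.0 * row["noise"]

def mean_abs_error(rows, targets):
    # Average absolute difference between model output and target.
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, feature, seed=0):
    """Increase in error after shuffling the values of one feature."""
    rng = random.Random(seed)
    baseline = mean_abs_error(rows, targets)
    shuffled_values = [r[feature] for r in rows]
    rng.shuffle(shuffled_values)
    # Copy each row, replacing only the chosen feature with a shuffled value.
    permuted = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled_values)]
    return mean_abs_error(permuted, targets) - baseline

# Synthetic data (assumption): targets come from the model itself,
# so the baseline error is zero and any increase is due to the shuffle.
data_rng = random.Random(42)
rows = [
    {"amount": data_rng.random(), "hour": data_rng.random(), "noise": data_rng.random()}
    for _ in range(200)
]
targets = [model(r) for r in rows]

importances = {
    name: permutation_importance(rows, targets, name)
    for name in ("amount", "hour", "noise")
}
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

A ranking like this is the kind of summary a business stakeholder can read without looking at the code: the feature the model leans on most shows the largest error increase, and an unused feature shows none.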

Prediction 2 — AI in 3D will revolutionize Autonomous Driving cars

A self-driving, or autonomous, car is one that drives with little or no control by the driver. The car has sensors that perceive its surroundings and a system that digests this sensory information to control movement, using so-called computer vision. Today, most self-driving cars, such as the Tesla Model X with the HW2.5 Autopilot system, use 2D object detection based on Convolutional Neural Network algorithms such as YOLO (You Only Look Once). Images of the surroundings captured by the camera sensors are processed only as 2D (depthless) objects. This may limit the capability of self-driving cars to make autonomous decisions.

A 3D vision vs. 2D vision of self-driving cars (Source: Libor Novak’s Master’s Thesis in YouTube)

Recently, there has been a movement to upscale from 2D to 3D monocular scene understanding. Since 2017, at least four algorithms, known as 3D Bounding Box Estimation algorithms, have attempted this: Deep3DBox, FQNet, Shift R-CNN, and Cascaded Geometric Constraint. There are also other approaches, for example the Pseudo-3D approach. With the advent of TensorFlow 3D in 2021, the future of self-driving cars that perceive surrounding objects in 3D is very promising. Since, according to Forbes, many people already use TensorFlow for deep learning, adopting the new 3D release should be straightforward. Other applications, such as the preservation of historical buildings in Augmented Reality, should also become visible in the near future.
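A small worked example may help show why a depthless 2D box is limiting. The sketch below uses the standard pinhole-camera relation (depth = focal length × real height / pixel height); it is not taken from any of the algorithms named above, and the focal length and object heights are assumed values for illustration only.

```python
def estimate_depth_m(box_height_px, real_height_m, focal_length_px):
    """Pinhole-camera estimate: depth = focal_length * real_height / pixel_height."""
    if box_height_px <= 0:
        raise ValueError("box height must be positive")
    return focal_length_px * real_height_m / box_height_px

FOCAL_LENGTH_PX = 1000.0  # assumed camera focal length, in pixels

# Two detections produce the same 100-pixel-tall 2D bounding box...
real_car = estimate_depth_m(100.0, real_height_m=1.5, focal_length_px=FOCAL_LENGTH_PX)
# ...but the second is a smaller object (assumed 0.5 m tall), so it is much closer.
small_object = estimate_depth_m(100.0, real_height_m=0.5, focal_length_px=FOCAL_LENGTH_PX)

print(f"car: {real_car:.1f} m away, small object: {small_object:.1f} m away")
```

The two boxes are identical in the image, yet the objects are at very different distances. A 2D detector alone cannot tell them apart, whereas a 3D bounding box carries the distance and physical size explicitly.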

Prediction 3 — Generative Adversarial Network (GAN) will revolutionize design manufactures

Without a doubt, GAN is what people think of when talking about generating human faces that look like real people in Deepfakes, creating new paintings in the style of Vincent van Gogh, or producing photorealistic images of a bottle of wine. In recent years, GAN has revolutionized art studios and the film-making industry.

In the near future, GAN will land in manufacturing enterprises. GAN can now learn from 3D rendered objects and produce 3D objects. In engineering terms, designing such 3D objects on a computer is called Computer-Aided Design (CAD). For example, a team of researchers from MIT CSAIL showed that GAN could learn to distinguish 3D furniture objects from IKEA’s large dataset of furniture photos and then produce 3D rendered objects of the recognized images. This is a huge breakthrough, since doing CAD manually in computer software is a lengthy and expensive process in manufacturing industries.

Images of IKEA furniture and their 3D rendered objects produced by GAN (Source: Bricsys)

In addition, the automobile industry will find more applications of GAN in designing vehicles and their components. Monolith AI, a start-up based in the UK, is one of the very few companies experimenting with GAN in product design optimization. They showed how GAN can be applied to topology optimization and mesh-less design generation to produce new automobile designs. The company also uses GAN to generate 3D objects of components that cannot be produced by 3D printing and to solve Computational Fluid Dynamics (CFD) simulations of those components.

A sequential generation of artificial automobile designs using generative AI (Source: Richard Ahlfeld’s post)

With these applications of GAN in manufacturing, AI is likely to become more mainstream in design optimization and additive manufacturing in the coming year.

Prediction 4 — Transformers and Cognitive AI will revolutionize language applications

Natural language processing (NLP) and natural language generation (NLG) have transformed the way enterprises understand their customers from tweets on social media, find the right facts and identify hoaxes in the news, and generate conversations with humans through chatbots. NLG relies on transformers, a deep learning architecture that consists of an encoder and a decoder, which process inputs (any form of data) and generate outputs (in the form of text).

The two most popular transformer-based language models are BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer). Google’s BERT, released in 2018, has 340 million parameters, whereas OpenAI’s GPT-3, which was claimed to be the most powerful transformer, has 175 billion parameters. The number of parameters is growing exponentially. Judging by this trend alone, the competition among transformers seems endless. So, what does this mean for the future of natural language generation?
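To put the growth in numbers, here is a quick back-of-the-envelope calculation. The BERT and GPT-3 parameter counts are the ones quoted above; the GPT-3 release year (2020) and the Gopher and Megatron-Turing NLG counts are my assumptions, taken from the public announcements rather than from this article.

```python
# Parameter counts. BERT and GPT-3 come from the text above; the release year
# of GPT-3 and the last two entries are assumptions from public announcements.
PARAMS = {
    "BERT (2018)": 340e6,
    "GPT-3 (2020)": 175e9,
    "Gopher (2021)": 280e9,
    "Megatron-Turing NLG (2021)": 530e9,
}

def growth_factor(old, new):
    """How many times larger the new model is."""
    return new / old

def annual_growth(old, new, years):
    """Constant yearly multiplier that would take `old` to `new` in `years`."""
    return (new / old) ** (1 / years)

bert = PARAMS["BERT (2018)"]
gpt3 = PARAMS["GPT-3 (2020)"]

factor = growth_factor(bert, gpt3)
yearly = annual_growth(bert, gpt3, years=2)

print(f"GPT-3 is about {factor:.0f}x the size of BERT")
print(f"That is roughly {yearly:.1f}x per year over two years")
for name, count in PARAMS.items():
    print(f"{name}: {growth_factor(bert, count):.0f}x BERT")
```

A multiplier of more than twenty per year is what "exponential" means in practice here: each year's largest model dwarfs the previous one rather than edging past it.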

Growth of the number of parameters in deep learning transformers (Source: Microsoft)

Natural language processing can classify positive and negative sentiment by decomposing text into sentence structures (syntax analysis) and analyzing the relations among words, grammar, and meaning (semantic analysis). We need to provide as much data as possible for NLP to recognize context. In the future, NLP will be “conscious” enough to understand more from text, such as inferring implicit meaning from the writer’s emotion, thus becoming Cognitive AI. With billions of parameters now available in transformers, the emergence of cognitive AI is near.

Cognitive AI will help enterprises give more personalized feedback to customers after they converse with chatbots. The most popular conversational assistants are Amazon’s Alexa and Apple’s Siri. The context of conversations will no longer be generic, such as asking for names or occupations. It will be diverse and specific, for instance science, human resources, or medical prescriptions in healthcare.

Almost realistic human conversations with GPT-3

Last but not least, NLP will be integrated with image processing, such as the processing of facial expressions, gestures, and body language, to understand the emotions in conversations between two humans. Cognitive AI will then generate artificial conversations. This biometric-inspired cognitive AI will be prevalent in the near future.

Prediction 5 — Digital twin in the metaverse will dominate industries

A digital twin is a virtual representation of an object or system that mirrors its physical behavior in real time. Many industries use it today to understand the behavior and flaws of their products, such as machinery or construction structures, and to improve the physical models through simulation. Companies like General Electric and Rolls-Royce are at the forefront, using digital twins to improve their jet engine designs.

To represent a prototype as a digital twin, the prototype is equipped with many sensors that measure different physical behaviors during operation. There may be hundreds of sensors whose readings need to be understood. With immense training data, AI can learn the complexity of these behaviors and the meaning of each sensor measurement, and can therefore be used to predict failures through predictive maintenance or provisioning.
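As a rough illustration of the predictive maintenance idea, the sketch below flags sensor readings that drift far from a healthy baseline. It uses a simple statistical threshold (a z-score), not a trained AI model, and the temperature readings and the 3-sigma threshold are invented for the example.

```python
from statistics import mean, stdev

def fit_baseline(healthy_readings):
    """Summarize normal operation as (mean, standard deviation)."""
    return mean(healthy_readings), stdev(healthy_readings)

def flag_anomalies(readings, baseline, threshold=3.0):
    """Return the indices of readings more than `threshold` sigmas from the mean."""
    mu, sigma = baseline
    return [
        i for i, value in enumerate(readings)
        if abs(value - mu) / sigma > threshold
    ]

# Hypothetical temperature readings (degrees Celsius) from one sensor.
healthy = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 70.0, 69.7, 70.4, 70.1]
live = [70.0, 70.2, 69.9, 71.9, 73.5, 75.2]  # starts drifting upward

baseline = fit_baseline(healthy)
alerts = flag_anomalies(live, baseline)

print(f"baseline mean={baseline[0]:.2f}, sigma={baseline[1]:.2f}")
print(f"readings flagged for inspection: {alerts}")
```

A real digital twin would combine hundreds of such signals and learn their interactions, but the principle is the same: learn what normal looks like, then raise an alert before the drift becomes a failure.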

Microsoft’s HoloLens mixed reality equipment (Source: Microsoft)

Now, companies are developing Virtual Reality and Augmented Reality (VR and AR) experiences that let people use their senses to interact directly with objects that do not physically exist in real space. The most common way of doing VR and AR has been to bring the experience “on the desk” with limited interactions. Technology now enables people to interact with objects seamlessly in a real space. The space where people can interact with objects through VR and AR is called the metaverse, and in the metaverse people can interact in real time. In the future, more companies will develop user experiences combined with AI applications in the metaverse.

Boeing recently announced that it will carry out 3D engineering design of its airplanes in the metaverse using HoloLens, Microsoft’s mixed reality headset. Bringing engineering into the metaverse has clear benefits: it reduces the cost of building real prototypes, whose components are expensive, and it allows repeatable experiments on the designs. In the near future, city planning, environmental building design, and disaster prevention are very likely to be carried out in the metaverse.

Conclusion

The year 2021 delivered major, exciting AI breakthroughs that set up many promising applications for 2022. The five key predictions are: AI will be more automated and explainable in business; AI in 3D will revolutionize computer vision in self-driving cars; GAN will enable optimized manufacturing design; transformers will revolutionize natural language generation and artificial human conversation; and digital twins will be carried out in the metaverse. Given the speed of innovation in both the theory of AI and its practical applications, enabled by sophisticated technology, AI will stay on course and keep helping humans with a variety of complicated tasks.