Artificial intelligence (AI) has come a long way, and its progress is evident in modern gadgets that use voice recognition, such as the Amazon Echo, and in techniques that rely on image recognition, such as CAPTCHA. Modern AI is built on Machine Learning (ML), which essentially means teaching machines to recognize and respond to various stimuli.
While the article gets technical with terms such as NLP (Natural Language Processing), GAN (Generative Adversarial Network), and Computer Vision, the writer is very down to earth, discussing these topics as one would discuss the weather, the neighbor’s dog, or the taste of a hamburger. The writer makes AI sound so easy and interesting that it may well inspire others to enter the field.
Read on as we summarize these deep learning insights, and visit Wikipedia for some context or this page for access to the best resources for studying Machine Learning. If you happened to stumble onto this article while looking for resources for your AI or Machine Learning project, you’re in luck, as there are specialists available who cater to such endeavors.
The first deep learning paper is about CLIP (Contrastive Language–Image Pre-training), an ambitious learning method that uses 400 million image/text pairs to teach machines image recognition through what is known as Contrastive Learning, a self-supervised approach that also makes zero-shot classification possible. The method pulls the representations of correct image/text pairs closer together while pushing incorrect image/text pairs further apart, which lets the machine discern between correct and incorrect text for the same image.
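To make the "closer for correct pairs, further for incorrect pairs" idea concrete, here is a minimal sketch of a symmetric contrastive loss in plain Python. It is only an illustration: the function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are assumptions chosen for the example, and a real system would compute embeddings with trained image and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    """Cross-entropy of one list of logits against the index of the correct entry."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch where image i matches text i."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    n = len(images)
    # sims[i][j] = scaled cosine similarity between image i and text j
    sims = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
            for img in images]
    # image -> text: each row should peak on the diagonal (the correct caption)
    loss_i2t = sum(cross_entropy(sims[i], i) for i in range(n)) / n
    # text -> image: each column should peak on the diagonal (the correct image)
    loss_t2i = sum(cross_entropy([sims[i][j] for i in range(n)], j)
                   for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

# Toy 2-D embeddings, made up for illustration (not real model outputs)
image_embs = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[0.9, 0.1], [0.1, 0.9]]   # caption i points the same way as image i
swapped_texts = [[0.1, 0.9], [0.9, 0.1]]   # captions deliberately mismatched

loss_matched = contrastive_loss(image_embs, matched_texts)
loss_swapped = contrastive_loss(image_embs, swapped_texts)
print(loss_matched < loss_swapped)
```

The loss is low when each image sits closest to its own caption and high when the captions are swapped, which is the signal training would use to pull correct pairs together and push incorrect ones apart.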
The second topic is diffusion models, which allow machines to generate, correct, enhance, or modify images. During training, noise is gradually added to an image until it becomes unrecognizable, and the AI learns to repair the image by reversing the process step by step, predicting what the image should be. We see plenty of this in internet ads for apps that perform photo corrections and enhancements. Whether or not those apps use the same concept, that is the general idea.
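The noising half of that idea can be sketched in a few lines of plain Python. This is an illustrative sketch only: the linear noise schedule, the 1000 steps, the helper names, and the four-value "image" are assumptions for the example, and the learned denoising network that runs the process in reverse is not shown.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances that grow linearly from beta_start to beta_end (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def signal_fractions(betas):
    """Running product of (1 - beta): how much of the original signal survives at each step."""
    kept, out = 1.0, []
    for beta in betas:
        kept *= 1.0 - beta
        out.append(kept)
    return out

def add_noise(pixels, t, alpha_bars, rng):
    """Jump straight to step t by blending the clean pixels with Gaussian noise."""
    keep = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [keep * p + noise_scale * e for p, e in zip(pixels, noise)]
    # The noise is returned because a denoising model is typically trained to predict it.
    return noisy, noise

rng = random.Random(0)
alpha_bars = signal_fractions(linear_betas(1000))
image = [0.5, -0.25, 1.0, 0.0]  # a tiny stand-in for pixel values

slightly_noisy, _ = add_noise(image, 10, alpha_bars, rng)
mostly_noise, _ = add_noise(image, 999, alpha_bars, rng)
print(alpha_bars[10], alpha_bars[999])
```

Early steps keep almost all of the original signal, while the final step keeps almost none, so the last sample is essentially pure noise. Learning to undo each small step is what lets a model recover or generate an image.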
Another topic discussed was the use of multi-layer perceptrons (MLPs) in Computer Vision, the field that allows computers to derive information from images, video, and other visual inputs. Instead of relying on Vision Transformers, practitioners can use MLP-Mixers, which might even improve the field. MLP-Mixers are still new but may soon be poised to take over the field, as they seem to be the easier choice.
Other topics discussed were GLOM, a Computer Vision model that seeks to improve AI’s understanding of visual scenes by splitting a scene into component parts arranged in a parse tree; Knowledge Distillation, which covers how knowledge can be transferred from a large neural network to smaller ones; and, lastly, an improvement on CNNs (Convolutional Neural Networks) that adds spatial factors, or points of observation, for images.
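Of these, Knowledge Distillation is the easiest to illustrate briefly. The sketch below shows the classic formulation, in which a small "student" network is trained to match the softened output probabilities of a large "teacher"; it is not necessarily the variant used in the paper the article covers. The logits, the temperature of 2.0, and the function names are assumptions made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student))
    # Scaling by temperature squared keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# Hypothetical logits for a three-class problem
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # confidently picks a different class

loss_close = distillation_loss(close_student, teacher_logits)
loss_far = distillation_loss(far_student, teacher_logits)
print(loss_close < loss_far)
```

A student that mirrors the teacher's full probability distribution, not just its top answer, gets a lower loss, which is how the larger network's knowledge is passed down to the smaller one.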
Read the entire article at the link above for the writer’s down-to-earth views and access to the relevant white papers, or gain an actual learning experience through this link. But if you’re actually on the lookout for a company that can implement your AI-related project, this is the place you need to go.