How to train an image recognition AI

Question

I'm working to create a sort of Operating Manual for a specific large building. There are a lot of pieces of equipment from various manufacturers. My idea is to have the maintenance person snap a photo of a piece of equipment, have AI recognize the model/manufacturer, and provide a quick link to that equipment's repair manual and parts list. Any guidance on where to start would be much appreciated.

thedevindevops · Accepted Answer

You probably don't want to hear this, but that problem doesn't require AI to solve. We've had asset tags and QR codes for decades, and they won't require retraining when a new equipment manufacturer comes into play.
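
A minimal sketch of the asset-tag approach, assuming each QR code simply encodes a short asset ID. The IDs, manufacturer, URLs, and the `lookup_docs` helper are all hypothetical placeholders, not a real system:

```python
from typing import NamedTuple, Optional


class EquipmentDocs(NamedTuple):
    manufacturer: str
    model: str
    manual_url: str
    parts_url: str


# Hypothetical registry: asset ID (the QR code payload) -> documentation links.
REGISTRY = {
    "AHU-01": EquipmentDocs(
        manufacturer="ExampleCo",
        model="X100",
        manual_url="https://example.com/manuals/x100",
        parts_url="https://example.com/parts/x100",
    ),
}


def lookup_docs(qr_payload: str) -> Optional[EquipmentDocs]:
    """Return the docs for a scanned asset ID, or None if the tag is unknown."""
    # Normalise whitespace and case so " ahu-01 " and "AHU-01" match.
    return REGISTRY.get(qr_payload.strip().upper())
```

Adding a new piece of equipment is then one new registry entry and one printed tag, with no model to retrain.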

benjbrooks · Answer

Disclaimer: I'm one of the founders. We're working on text/image-prompted vision models at DirectAI (https://directai.io). We help clients detect bespoke objects by defining them with a single phrase or prompt image. I'd be happy to talk about helping you build out a system like this without having to collect large amounts of data, annotate it, and then train a bunch of custom models.

bjacobt · Answer

I don't do AI professionally, only as a hobby, so this may not be the best way. From the way you described it, the user may be taking the picture from a bit further away, and there may be other objects in the frame. So you may want to look into some sort of segmentation or bounding boxes. This could help the user make sure they are looking at documents for the correct machine.
I think something like detectron2 [1] could help. It is Apache 2.0 licensed, so commercial friendly. That said, the pre-trained weights may not be commercial friendly, so you'll want to check on that.
Also, the fast.ai course [2] is a good starting point to understand the basics. If you are pressed for time, just go through Lesson 1.
[1] https://github.com/facebookresearch/detectron2
[2] https://course.fast.ai/
Edit: added fast.ai, grammar

mpeg · Answer

I've had good results in the past fine-tuning YOLO for image classification and object detection within images. You can find a rough guide on how to create a good training dataset and such here [0] [1].
YMMV though; ultimately accuracy is going to depend on the quality of the labelled data and your use case.
There might be models better suited to your specific needs too, but you're always going to need the training dataset.
[0]: https://labelstud.io/blog/getting-started-with-image-classif...
[1]: https://docs.ultralytics.com/tasks/classify/

ksherlock · Answer

You start by taking a lot of pictures of all your equipment and writing down what the model/manufacturer is for each. Then you train on some of the pictures and test on the rest.
Xcode has drag-and-drop support if you're an iOS person. Otherwise you could use, e.g., TensorFlow.
https://developer.apple.com/documentation/createml/creating-...
https://www.tensorflow.org/tutorials/images/classification
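
The "train on some, test on some" step can be sketched like this, assuming the photos are already recorded as (path, label) pairs. The function name, the 20% hold-out, and the file names are illustrative assumptions, not anything Create ML or TensorFlow requires:

```python
import random
from collections import defaultdict


def split_by_label(labelled_photos, test_fraction=0.2, seed=0):
    """Split (path, label) pairs into train and test lists, label by label."""
    by_label = defaultdict(list)
    for path, label in labelled_photos:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label, paths in sorted(by_label.items()):
        paths = sorted(paths)
        rng.shuffle(paths)
        # Hold out at least one photo per label when the label has two or more.
        n_test = max(1, round(len(paths) * test_fraction)) if len(paths) > 1 else 0
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test


# Hypothetical file names, just to show the shape of the input.
photos = [(f"pump_{i}.jpg", "pump") for i in range(10)]
photos += [(f"boiler_{i}.jpg", "boiler") for i in range(5)]
train_set, test_set = split_by_label(photos)
```

Splitting per label rather than over the whole pile keeps a rarely photographed model from ending up entirely in one of the two sets.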

piezoelectric · Answer

I'm not a professional at AI, but I highly recommend using YOLOv5. It's available on GitHub and there's plenty of documentation and YouTube videos on it. You can clone it, install the requirements, and have fun. You can even use a live camera!

Vox_Leone · Answer

Training an image recognition AI involves several steps and requires a combination of data collection, preprocessing, model selection, training, evaluation, and fine-tuning. Here's a high-level overview of the process:
1. Data Collection: Gather a large dataset of images relevant to the task you want the AI to perform [recognize pieces of equipment from various manufacturers]. The dataset should cover a wide range of variations and scenarios that the AI might encounter in real-world situations.
2. Data Preprocessing: Clean and preprocess the images to ensure uniformity and remove noise. This might involve resizing, cropping, normalizing pixel values, and augmenting the dataset with techniques like rotation, flipping, and adding noise to increase variability.
3. Model Selection: Choose a suitable deep learning architecture for image recognition, such as Convolutional Neural Networks (CNNs), which are highly effective for this task. Popular pre-trained models like VGG, ResNet, Inception, and MobileNet are often used as starting points.
4. Training: Split your dataset into training, validation, and test sets. Train the selected model on the training data using techniques like stochastic gradient descent (SGD) or Adam optimization. During training, the model learns to map input images to their corresponding labels or classes.
5. Evaluation: Assess the performance of the trained model using the validation set. Common evaluation metrics for image recognition tasks include accuracy, precision, recall, and F1-score. Adjust hyperparameters and model architecture based on the validation results to improve performance.
6. Fine-Tuning: Fine-tune the model by adjusting its parameters or using techniques like transfer learning. Transfer learning involves leveraging pre-trained models trained on large-scale datasets (e.g., ImageNet) and fine-tuning them on your specific dataset to achieve better performance with less training data.
7. Testing: Once satisfied with the model's performance on the validation set, evaluate it on the test set to assess its generalization ability to unseen data. This step helps ensure that the model performs well in real-world scenarios.
8. Deployment: Deploy the trained model in production environments, whether it's on edge devices, servers, or cloud platforms, depending on your application requirements. Implement mechanisms for model monitoring and updates to maintain performance over time.
Throughout the process, it's essential to iterate and refine each step based on insights gained from experimentation and evaluation. Additionally, staying updated with the latest research and techniques in the field of computer vision can help you improve the performance of your image recognition AI.
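
As a rough illustration of the evaluation step (5), here is one way the listed metrics could be computed by hand for a multi-class problem, given true and predicted labels. The function name and the example labels are assumptions for illustration; in practice a library routine would normally be used instead:

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return (accuracy, per-label precision/recall/f1) for two equal-length label lists."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this label, but it was wrong
            fn[truth] += 1  # failed to predict the true label

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical validation labels for two equipment classes.
accuracy, report = classification_report(
    ["pump", "pump", "boiler", "boiler"],
    ["pump", "boiler", "boiler", "boiler"],
)
```

Looking at the per-label numbers rather than accuracy alone helps spot an equipment model that the classifier keeps confusing with another.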