vault backup: 2025-02-03 07:04:13

2025-02-03 07:04:14 +01:00
parent e158386068
commit 7909836706
2019 changed files with 59 additions and 26816 deletions
--- a/OneSec/OneSecNotes/30
+++ b/OneSec/OneSecNotes/30
@@ -1,138 +0,0 @@
---
-title: Computer Vision
-created_date: 2024-10-22
-updated_date: 2024-10-22
-aliases:
-tags:
---
-# Computer Vision
-
---
- [ ] 3d reconstruction
- [ ] camera calibration
- [ ] photogrammetry
- [ ] image segmentation
- [ ] facial recognition and eigenfaces
- [ ] image stitching
- [ ] feature recognition
- [ ] connection to [[LLM]]s and [[Multi Modal Models]]
- [ ] [[Convolutional Neural Networks]]
- [ ] [[Deep Learning]]
- [ ] [[Signal Processing]]
- [ ] Vision Transformer (ViT)
- [ ] Tactile feedback sensors through CV
- [ ] Structured-light 3D scanners
- [ ] thermal cameras
- [ ] radar imaging
- [ ] lidar scanners
- [ ] MRI
- [ ] Sonar
- [ ] 
---
-## Introduction
- Computer Vision acquires, processes, analyzes and understands digital images
- CV works with high-dimensional data and extracts useful information from it: it transforms visual information into meaningful descriptions of the world that can lead to appropriate decisions and actions.
- Many subdomains are known
-	- Object detection and recognition
-	- Event detection
-	- 3D pose estimation
-	- motion estimation
-	- image restoration
- Definition:
-> Computer vision is a field of AI that enables computers to interpret, understand and analyze visual data from images or videos, simulating human vision. It involves tasks like object detection, image classification, and facial recognition, with applications in areas like autonomous vehicles and medical imaging.
-
-### Distinctions
- [[Image Processing]] focuses on 2D images and on transforming one image into another, so both the input and the output of image processing are images. Image processing therefore neither interprets the image content nor requires assumptions about it.
- [[Machine Vision]] focuses on image-based automation of inspection, process control, and robot guidance in industrial applications. Image sensor technologies and [[control theory]] are often closely intertwined with machine vision, and the system can frequently interact with the world, e.g. by altering the lighting.
- [[Imaging]] focuses primarily on producing images and sometimes also interpreting them. E.g. [[medical imaging]] focuses on producing medical images and detecting diseases through them.
-### Foundational Techniques
- Edge detection
- line labelling
- non-polyhedral and polyhedral modelling
- optical flow
- motion estimation
- [[Divide and Conquer]] strategies: run CV algorithms on regions of interest (ROIs) instead of the entire image.
-
-### Applications and Tasks
- Automate inspection
- Identification tasks: e.g. species id
- Controlling processes: e.g. robot
- Detecting events: surveillance, counting, etc.
- Monitoring: health, disease, state of object, color gradation, etc.
- modeling objects
- navigation
- organisation of information: indexing existing photos
- tracking of objects, surfaces, edges
- tactile feedback sensor: put a silicone dome with known elastic properties over a camera, with markers on the inside of the dome. When the silicone dome touches something, the markers move, and a model can calculate the forces and the interaction with the object from their displacement.
-
-
---
-## Recognition
- Object recognition: predefined objects can be identified by class, but individual instances are not differentiated.
- Identification: specific objects are detected and individually tracked, so two different people can be differentiated.
- Detection: objects are recognized together with their location, e.g. [[Obstacle Detection]] for robots.
-
-[[Convolutional Neural Networks |CNN]]s are currently the state-of-the-art algorithms for object detection in images. They are nearly as good as humans (only very thin objects don't work well), and even better than humans in fine-grained subcategories (such as breeds of dogs or species of birds).
-
-### Specialized Tasks based on recognition
- Content-based image retrieval: give me all images with multiple dogs in them
- Pose estimation: estimate the pose of an object relative to the camera: e.g. robot arm, human pose, obstacle, etc.
- [[Optical Character Recognition]]: identify characters in images. It is used by many phones and nowadays even by Obsidian. Reading QR codes is a similar task.
- [[Facial Recognition]]: matching of faces
- Emotion recognition
- Shape Recognition Technology (SRT)
- (Human) Activity Recognition
-
-## Motion Analysis
-Estimating the velocity of an object from an image sequence makes it possible to track objects (or the camera itself).
- Egomotion: tracking the rigid 3D-motion of the camera
- Tracking: follow the movements of objects in the frames (humans, cars, obstacles)
- [[Optical Flow]]: determine how each point moves relative to the image plane. The result combines the motion of the observed point with the motion of the camera. It can be used for state estimation of a [[Drone]], for example.
-## Others
- Scene reconstruction: the goal is to compute a 3D-Model of a scene from images.
- Image restoration: 
-
-
---
-## Courses
-### Udacity
-Udacity's course on [computer vision](https://www.udacity.com/course/computer-vision-nanodegree--nd891). Two-week free trial.
-1. Image Representation and Classification: numeric representation of images, color masking, binary classification
-2. Convolutional Filters and Edge Detection: frequency in images, image filters for detecting edges and shapes in images, use OpenCV for face detection
-3. Types of Features & Image Segmentation: corner detector, k-means clustering for segmenting an image into unique parts
-4. feature vectors: describe objects and images using feature vectors
-5. CNN layers and feature visualization: define and train your own CNN for clothing recognition, use feature visualization techniques to see what a network has learned
-6. Project: Facial Keypoint detection: create CNN for facial keypoint (eyes, mouth, nose, etc.) detection
-7. Cloud Computing with AWS: train networks on Amazon's GPUs
-8. Advanced CNN architectures: region based CNNs, Faster R-CNN --> fast localized object recognition in images
-9. YOLO: multi object detection model
-10. RNNs: incorporate memory into deep learning models using recurrent neural networks; how they learn from and generate ordered sequences of data
-11. Long Short-Term Memory Networks (LSTMs): dive into architecture and benefits of preserving long term memory
-12. Hyperparameters: what hyperparameters are used in deep learning?
-13. Attention Mechanisms: Attention models: how do they work?
-14. Image Captioning: combine CNN and RNN to build automatic image captioning model
-15. Project: Image Captioning Model: predict captions for a given image: implement an effective RNN decoder for a CNN encoder
-16. Motion: mathematical representation of motion, introduction of optical flow
-17. Robot Localization: Bayesian filter, uncertainty in robot motion
-18. Mini-Project: 2D Histogram filter: sense and move functions for a 2D histogram filter
-19. Kalman Filters: intuition behind the Kalman filter, vehicle tracking algorithm, one-dimensional tracker implementation
-20. State and Motion: represent the state of a car as a vector that can be modified using linear algebra
-21. Matrices and Transformation of State: LinAlg: learn matrix operations for multidimensional Kalman Filters
-22. SLAM: implement SLAM for an autonomous vehicle and create a map of landmarks
-23. Vehicle Motion and Calculus
-24. Project: Landmark Detection & Tracking: implement SLAM using probability, motion models and linear algebra
-25. Apply Deep Learning Models: Style transfer using pre-trained models that others have provided on GitHub
-26. Feedforward and Backpropagation: introduction to neural networks feedforward pass and backpropagation
-27. Training Neural Networks: techniques to improve training
-28. Deep Learning with PyTorch: build deep learning models with PyTorch
-29. Deep learning for Cancer detection: CNN detects skin cancer
-30. Sentiment Analysis: CNN for sentiment analysis
-31. Fully-convolutional neural networks: classify every pixel in an image
-32. C++ programming: getting started
-33. C++: vectors
-34. C++: local compilation
-35. C++: OOP
-36. Python and C++ Speed
-37. C++ Intro into Optimization
-38. C++ Optimization Practice
-39. Project: Optimize Histogram Filter