A picture says more than 1000 words!

In recent years, neural networks (deep learning) have achieved many notable successes in image recognition. For example, healthcare providers use neural networks to predict medical diagnoses, and industry uses them to visually detect defects in manufacturing materials and finished products. However, the images are almost always flattened and projected in 2D, so the perception of depth is lost. Fortunately, LiDAR sensors make 3D data accessible, and the use of LiDAR is therefore increasing rapidly. A recent study by GlobeNewswire predicted that the LiDAR market would grow by 22.7% by 2026.

Point Cloud challenges in Deep Learning

LiDAR sensors use laser pulses to make hundreds of thousands of highly accurate measurements per second. The measurements are converted to points that are spatially defined by X, Y, and Z coordinates. Besides the spatial coordinates, points can also carry additional features such as intensity (I) and color (R, G, B). However, Deep Learning on Point Clouds brings challenges, because the data has different properties from ordinary 2D images.
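As a rough illustration of the representation described above, the sketch below models a point with X, Y, and Z coordinates plus optional intensity and RGB color. The class name `LidarPoint`, its field names, and the helper `bounding_box` are assumptions made for this example; they do not come from any particular LiDAR file format or library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LidarPoint:
    """One measurement: spatial coordinates plus optional extra features (hypothetical layout)."""
    x: float
    y: float
    z: float
    intensity: Optional[float] = None            # return strength (I), if recorded
    rgb: Optional[Tuple[int, int, int]] = None   # (R, G, B) color, if recorded


def bounding_box(cloud):
    """Axis-aligned extent of a cloud: ((min_x, min_y, min_z), (max_x, max_y, max_z))."""
    xs, ys, zs = zip(*((p.x, p.y, p.z) for p in cloud))
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# A Point Cloud is simply a collection of such points: no grid, no fixed size, no fixed order.
cloud = [
    LidarPoint(0.0, 0.0, 0.0, intensity=0.8),
    LidarPoint(1.2, -0.4, 2.5, intensity=0.3, rgb=(120, 130, 90)),
    LidarPoint(-3.1, 2.0, 0.7),
]

print(bounding_box(cloud))
```

Unlike an image, where every pixel has a fixed row and column, nothing in this structure says which point comes "first" or how many points there are.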

This is mainly because a Point Cloud differs from a flat RGB image in three ways: it is unstructured, irregular, and unordered. Typical Deep Learning models for RGB data rely on the regular XY pixel grid to process the data. RGB pixels cannot be arbitrarily reordered (permuted), because that would destroy the image. The points in a Point Cloud can be: the shapes of the objects they represent are invariant under such permutations. A network that processes Point Clouds must therefore produce the same prediction regardless of the order in which the points are presented. Only in this way can the architecture deal with Point Clouds and other unordered 3D datasets. We say that the neural network must be permutation invariant to make predictions possible.
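To make permutation invariance concrete, here is a minimal sketch that compares two ways of summarizing the same points. The per-point feature function is a toy chosen for this example, not a trained network. Flattening the coordinates in storage order changes when the points are reordered, while an element-wise maximum over per-point features does not.

```python
def per_point_feature(point):
    """Toy feature map (hypothetical), applied to each point independently."""
    x, y, z = point
    return (x + y, y * z, abs(x - z))


def global_feature(points):
    """Element-wise max over per-point features: a symmetric, order-independent summary."""
    features = [per_point_feature(p) for p in points]
    return tuple(max(column) for column in zip(*features))


def flattened(points):
    """Concatenate coordinates in storage order, the way a grid-based model would read them."""
    return tuple(c for p in points for c in p)


cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (-2.0, 4.0, 1.0), (1.5, 1.5, 1.5)]
reordered = list(reversed(cloud))  # same shape, different point order

print(flattened(cloud) == flattened(reordered))            # False: order leaks into the input
print(global_feature(cloud) == global_feature(reordered))  # True: the summary ignores order
```

Any symmetric operation (maximum, sum, mean) would give the same guarantee; the maximum is used here only as one simple choice.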

PointNet was released in 2017 to address these challenges for the classification and segmentation of Point Cloud data. It offers a unified architecture that processes Point Cloud datasets directly and learns to classify them. It can label the entire input at once (classification) or assign a label to each point (segmentation). The architecture is robust to permutations of the data and is also designed to be robust to transformations such as rotation. Finally, the technology serves as a backbone: it collects information from each point and converts the input into a higher-dimensional vector. Thanks to PointNet, systems can be developed that extract information from 3D images and recognize, understand, and interpret it in a meaningful way.
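The backbone idea can be sketched in a few lines. This is a simplified illustration in the spirit of PointNet, not the published architecture: the weights are random and untrained, the layer sizes and number of classes are assumptions, and the alignment networks that PointNet uses to handle transformations are left out. All function names are hypothetical.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix (n_out rows of n_in weights); untrained, for illustration only."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, vector):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu(vector):
    return [max(0.0, v) for v in vector]


def classify(points, shared_layer, head_layer):
    # 1. The same layer is applied to every point, lifting 3 coordinates to a higher dimension.
    per_point = [relu(linear(shared_layer, p)) for p in points]
    # 2. Max pooling across points gives one global feature vector, whatever the point order.
    global_feature = [max(column) for column in zip(*per_point)]
    # 3. A small head maps the global feature to one score per class.
    scores = linear(head_layer, global_feature)
    return scores.index(max(scores)), global_feature


rng = random.Random(42)
shared_layer = make_layer(3, 16, rng)   # 3 -> 16 features per point (sizes are assumptions)
head_layer = make_layer(16, 4, rng)     # 16 -> 4 hypothetical classes

cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (-2.0, 4.0, 1.0), (1.5, 1.5, 1.5)]
label, feature = classify(cloud, shared_layer, head_layer)
label_reversed, feature_reversed = classify(cloud[::-1], shared_layer, head_layer)

print(label == label_reversed)  # the prediction does not depend on point order
```

For segmentation, the global feature would be combined with each per-point feature before predicting a label per point; that step is omitted here to keep the sketch short.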

A picture says more than 1,000 words

Computer Vision has grown enormously within the AI community, thanks to PointNet. We increasingly see new AI solutions based on 3D data. Construction companies, in particular, have opted for Point Cloud technology. For example, 3D technologies are used for drone scans, eliminating the need for people on-site to take measurements. In addition, they can use 3D for other visual inspection purposes. Think of automated quality control by digital inspectors, so that maintenance employees carry out fewer inspection rounds. One example is a solution that inspects road surfaces and automatically detects defects from camera images. Thanks to technology like this, maintenance companies can see more readily which assets, such as lighting, tile floors, smoke detectors, and surveillance cameras, need maintenance. This enables them to manage assets more efficiently, save costs, and better identify safety risks.

No wonder there is a growing demand for 3D analytics. Point Clouds are the future of Computer Vision. AI-based solutions can now consume data in its canonical form and interpret observations in 3D. Providing inspectors with this extra information will lead to better results and more robust performance. It will therefore not be long before more visual tasks currently performed by humans are performed by intelligent digital inspectors. It is now essential to think about the impact of Computer Vision on our social and economic structures. If we do this right, the benefits and possibilities are endless. After all, pictures say more than 1,000 words!

Read the full article in Dutch here.

Maarten Stol, Principal Scientific Adviser at BrainCreators, and Ghailen Ben Achour, Researcher at BrainCreators.
