Deep learning - Q-engineering

We do deep learning

As you can see all over our website, we are deep learning craftsmen.

What is deep learning?

 
Deep learning is a modern class of computer algorithms capable of learning patterns. This can be any kind of pattern, from an apple or a handwritten character to a chess strategy. No wonder it is often used in computer vision.

Don't be mistaken: internally, a deep learning network works with analogue (real-valued) numbers. It is neither a calculator nor a digital computer, although you need a powerful computer to run these algorithms. At the output, you still get an analogue number representing a probability of validity. That is why some people call deep learning 'Statistics 2.0'.
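To make this concrete, the raw real-valued outputs of a classification network are usually squashed into probabilities by a softmax function. A minimal sketch (the class names and scores are hypothetical, not from any particular network):

```python
import math

def softmax(logits):
    """Convert raw network outputs (real numbers) into probabilities."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes: apple, egg, pear.
probs = softmax([2.0, 1.0, 0.1])
print(probs)  # each value lies in (0, 1) and they sum to 1
```

Whatever the network computed internally, the reader of its output only ever sees these probability-like numbers, never a hard yes/no answer.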

Neural networks have been around for quite some time. In the late 1940s the pioneer D.O. Hebb, a Canadian neuropsychologist, started with the simulation of neurons. Hebbian learning was one of the first training algorithms in the field.

One important precursor of the modern deep learning topologies is the neocognitron by Kunihiko Fukushima, used to recognize handwritten characters in 1980 (paper, source code). Its general topology has many similarities with the modern layout: every layer in the network increases the complexity of the recognized features.

Schematic overview of the neocognitron

The re-introduction of the backpropagation rule in 1985 by Rumelhart is another milestone. Now relatively large networks could be trained with a simple chain rule. Before that moment, neurons lacked the differentiable sigmoid output necessary for gradient-based optimization.
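The chain rule at the heart of backpropagation can be shown on a single sigmoid neuron. This is only an illustrative sketch (the weight, input and learning rate are made-up values), but it is the same gradient step that backpropagation applies layer by layer in a deep network:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One sigmoid neuron trained by gradient descent on a single example.
w, b = 0.5, 0.0          # weight and bias (hypothetical starting values)
x, target = 1.5, 1.0     # input and desired output
lr = 0.5                 # learning rate

for _ in range(1000):
    y = sigmoid(w * x + b)
    # Loss = 0.5 * (y - target)^2.
    # Chain rule: dLoss/dw = (y - target) * y * (1 - y) * x
    delta = (y - target) * y * (1.0 - y)
    w -= lr * delta * x
    b -= lr * delta

y_final = sigmoid(w * x + b)
print(y_final)  # close to the target after training
```

Without the smooth sigmoid, the derivative `y * (1 - y)` in the middle of that chain would not exist, and there would be no gradient to follow.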

In 2012 AlexNet was introduced. Before that moment, designers had to painstakingly handcraft features from the input before the network could even begin to recognize anything. AlexNet, on the contrary, is capable of selecting and moulding the needed features itself. Together with the massive use of GPU computational power, it rightly marks the start of a new phase in machine learning.


AlexNet
AlexNet. Two streams are drawn because two GPUs were used to implement the network. Even so, it took five days to train the network.

A few last words about deep learning. First, as you can see in the diagram of AlexNet, the input scene is not very large. Most deep learning networks take images of 128x128 or 224x224 pixels as input. This has everything to do with the computational complexity of deep learning.
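A quick back-of-the-envelope calculation shows why the input stays small. The multiply-accumulate (MAC) count of a convolution layer grows with the pixel area, so feeding a full-HD frame instead of a 224x224 crop multiplies the cost of every layer by roughly 41. The layer dimensions below are hypothetical, chosen only to make the scaling visible:

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulate count of one stride-1 'same' convolution layer."""
    return h * w * c_in * c_out * k * k

# Hypothetical first layer: 3 input channels, 64 filters, 3x3 kernels.
small = conv_macs(224, 224, 3, 64, 3)      # typical network input
large = conv_macs(1920, 1080, 3, 64, 3)    # full-HD camera frame
print(small, large, large / small)         # full-HD costs ~41x more per layer
```

The same factor applies to every convolution layer in the network and to the memory for the intermediate feature maps, which is why images are almost always downscaled first.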

Garbage in, garbage out. In other words, the outcome is only as good as your (many) training samples. The more vague or doubtful the input samples, the harder the network will be to train, or it may not train at all. A golden rule: if we humans can't see the difference, your neural network won't be able to either.

You cannot manipulate the training. There are a few global parameters, like the learning or adaptation rate, but how the network will differentiate between an apple and an egg (shape, colour, size, stamp) is not in your hands. That can be a burden.
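In Caffe, those few global parameters live in the solver definition. A minimal sketch of a solver.prototxt (the file name and the values are hypothetical, just to show where the knobs are):

```
net: "train_val.prototxt"   # network definition file (hypothetical name)
base_lr: 0.01               # the learning rate, one of the few things you can tune
lr_policy: "step"           # lower the rate every stepsize iterations
gamma: 0.1
stepsize: 10000
momentum: 0.9
weight_decay: 0.0005
max_iter: 45000
solver_mode: GPU
```

Everything else, which feature ends up in which filter, is decided by the training itself.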

And last, don't overvalue deep learning. Deep learning is used everywhere these days, even in situations where some simple good old-fashioned vision algorithm could do the job wonderfully.

Layers in network
Increasingly complex features in the layers of a neural network. Image from Stanford's CS 231N course.

Gift shop


Due to NDAs, we cannot show any cases here. Instead, we have opened a gift shop where you can download several deep learning networks for your own use.

NCNN is our fork of the original ncnn by the Chinese internet giant Tencent. It is a framework specially designed for ARM processors, as found in smartphones, the Raspberry Pi and its alternatives. On top of this, thanks to its Vulkan interface, it also supports AMD graphics cards. It takes Caffe networks as input.

The Caffe framework used is the weiliu89 fork of Caffe. Wei Liu, being one of the designers of the SSD (Single Shot Detector), has modified the Caffe architecture with some new layers. Installation of the software is identical to the original Berkeley version.

It all speaks for itself. The name of each network is also the link to the original paper.
Table: downloadable networks. For each model the table lists the input size, the training dataset, the Top-1 accuracy or mAP, the inference time (mSec) on a Raspberry Pi with ncnn and on an RX580, and download links for the NCNN, Caffe, TensorFlow, Keras and PyTorch versions. Entries range from 224x224 and 227x227 ImageNet 2012 classifiers (e.g. 69.4 Top-1 at 1x width and 60.3 at 0.5x; EfficientNet at 82.6 for B4 and 76.3 for B0; a 331x331 model at 82.8) to VOC2007 and VOC2012 detectors at 304x304, 352x352 and 600x600 input.

DIY deep learning on Raspberry Pi