Deep learning examples on Raspberry Pi 32/64-bit OS
On this page, we focus on the software. Ultimately, you should be able to run the examples presented here yourself. They are all written in C++, one of the fastest programming languages. All use only a Raspberry Pi; no additional hardware is required. However, you can easily include the Raspicam in the applications. We start with an overview. Later, some specific software details will be discussed.
The overview speaks for itself. The highest frame rate measured comes from a Raspberry Pi 64-bit OS overclocked to 1950 MHz. The lowest is the standard 32-bit Raspbian at 1500 MHz. Frame rates are based only on model run time (interpreter->Invoke()). Grabbing and preprocessing an image are not taken into account, nor is plotting output boxes or text. This way, models working with pictures instead of video streams can also be measured. The actual frame rate will be slightly lower when playing videos or using a camera.
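A minimal sketch of such a measurement, assuming the inference call is wrapped in a callable. The names measure_fps and run_model are hypothetical; run_model stands in for interpreter->Invoke(), and the code used for the published numbers may differ.

```cpp
#include <chrono>
#include <functional>

// Times only the inference call and returns the average frames per second.
// `run_model` is a hypothetical stand-in for interpreter->Invoke().
double measure_fps(const std::function<void()>& run_model, int runs)
{
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        run_model();   // image grabbing and drawing stay outside this loop
    }
    const auto stop = clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    if (seconds <= 0.0) {
        return 0.0;    // too fast to measure with this clock
    }
    return runs / seconds;
}
```

Averaging over several runs smooths out the first, usually slower, invocation.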
This application detects multiple objects in a scene. The most commonly used models are SSD (Single Shot Detector) and YOLO (You Only Look Once). We have some examples of the YOLO version on GitHub, but here the TensorFlow Lite SSD is explored because it is one of the fastest. The COCO SSD MobileNet v1 recognizes 80 different objects. It can detect up to ten objects in a single scene. Note also that the 64-bit version is suitable for both the Raspberry Pi 64-bit OS and Ubuntu 18.04 or 20.04.
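After the model has run, the detections have to be turned into boxes on the image. The sketch below shows one way to do that. It assumes the output layout commonly seen with TensorFlow Lite SSD models (normalized boxes as ymin, xmin, ymax, xmax, plus class and score arrays); check this against the model you actually load. Detection and decode_detections are hypothetical names.

```cpp
#include <vector>

struct Detection {
    int   class_id;
    float score;
    int   left, top, right, bottom;   // pixel coordinates in the source image
};

// Assumed output layout (verify against your model):
//   boxes   : count * 4 floats, [ymin, xmin, ymax, xmax], normalized to 0..1
//   classes : count floats, each holding a class index
//   scores  : count floats in the range 0..1
std::vector<Detection> decode_detections(const float* boxes, const float* classes,
                                         const float* scores, int count,
                                         int img_width, int img_height,
                                         float threshold)
{
    std::vector<Detection> result;
    for (int i = 0; i < count; ++i) {
        if (scores[i] < threshold) {
            continue;                 // skip weak detections
        }
        Detection d;
        d.class_id = static_cast<int>(classes[i]);
        d.score    = scores[i];
        d.top      = static_cast<int>(boxes[4 * i + 0] * img_height);
        d.left     = static_cast<int>(boxes[4 * i + 1] * img_width);
        d.bottom   = static_cast<int>(boxes[4 * i + 2] * img_height);
        d.right    = static_cast<int>(boxes[4 * i + 3] * img_width);
        result.push_back(d);
    }
    return result;
}
```

With at most ten detections per scene, count never exceeds 10 for this model; the threshold decides how many of them are drawn.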
This application tries to detect the outline of multiple objects in a scene. It is done with so-called semantic segmentation: a neural network attempts to associate every pixel in a picture with a particular subject. TensorFlow Lite has one segmentation model capable of classifying 20 different objects. Keep in mind that only reasonably sized objects can be recognized, not a highway scene with lots of tiny cars. Just like the previous example, the 64-bit version is suitable for both the Raspberry Pi 64-bit OS and Ubuntu 18.04 or 20.04.
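Associating every pixel with a subject comes down to picking the class with the highest score per pixel. The sketch below illustrates that step under the assumption that the model returns one float score per class for every pixel, stored pixel by pixel. The name label_pixels is hypothetical, and the number of classes is a parameter because models often add a background class.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Picks the best-scoring class for every pixel.
// Assumed layout: scores holds height * width * classes floats, with the
// class scores of one pixel stored next to each other.
std::vector<std::uint8_t> label_pixels(const std::vector<float>& scores,
                                       int width, int height, int classes)
{
    const std::size_t pixels = static_cast<std::size_t>(width) * height;
    if (classes <= 0 || scores.size() != pixels * classes) {
        return {};                    // layout does not match, nothing to label
    }

    std::vector<std::uint8_t> labels(pixels, 0);
    for (std::size_t p = 0; p < pixels; ++p) {
        const float* px = &scores[p * classes];
        int best = 0;
        for (int c = 1; c < classes; ++c) {
            if (px[c] > px[best]) {
                best = c;
            }
        }
        labels[p] = static_cast<std::uint8_t>(best);
    }
    return labels;
}
```

The resulting label map can then be colored per class and blended with the input picture.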
This application estimates a person's pose in a scene. The deep learning model used recognizes elbows, knees, ankles, etc. TensorFlow Lite supports two models, a single-person and a multi-person version. We have only used the single-person model because it gives good results when the person is centred and in full view in a square-like image. The multi-pose variant lacked robustness in our tests. Only under optimal conditions did this model give an acceptable outcome. As usual, the 64-bit version is suitable for both the Raspberry Pi 64-bit OS and Ubuntu 18.04 or 20.04.
The best-known application is, of course, the classification of objects. Google hosts a wide range of TensorFlow Lite models, the so-called quantized models, in their zoo. The models are capable of detecting 1000 different objects. Square images have been used to train the models. Therefore, the best results are obtained with a square-like input image. Our C++ example supports all TensorFlow Lite models from the zoo, as you shall see. Again, the 64-bit version is suitable for both the Raspberry Pi 64-bit OS and Ubuntu 18.04 or 20.04.
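A classifier returns one score per class, and the application usually shows only the best few. The sketch below picks the k highest scores from a quantized (8-bit) output. It assumes the scores map linearly from 0..255 to 0..1; the exact scale and zero point should be read from the model's output tensor. The name top_k is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Returns (class index, probability) pairs for the k highest scores.
// Assumption: an 8-bit score of 255 corresponds to a probability of 1.0.
std::vector<std::pair<int, float>> top_k(const std::vector<std::uint8_t>& scores,
                                         std::size_t k)
{
    std::vector<std::size_t> order(scores.size());
    std::iota(order.begin(), order.end(), std::size_t{0});   // 0, 1, 2, ...

    k = std::min(k, order.size());
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];       // highest score first
                      });

    std::vector<std::pair<int, float>> best;
    for (std::size_t i = 0; i < k; ++i) {
        best.emplace_back(static_cast<int>(order[i]), scores[order[i]] / 255.0f);
    }
    return best;
}
```

The returned indices can be looked up in the label file that ships with the model.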
Face mask detection.
This application detects if a person is wearing a face mask or not. Two phases are involved in this. First, all the faces are recognized and marked in the picture. Next, every face will be examined by a second deep learning network, which will predict if the person is wearing a mask.
The face recognition software is from Linzaer (linzai). The detection of the face mask is from Baidu. Both are in the public domain. You have to install two frameworks on the Raspberry Pi: ncnn and Paddle Lite. Unfortunately, Paddle Lite cannot be installed on a Raspberry Pi with a 32-bit operating system, only on a 64-bit OS, or on a Raspberry Pi 3. Ubuntu 18.04 or 20.04 on a Raspberry Pi 4 is also a possibility. The frame rate depends on the number of detected faces and can be calculated as follows: FPS = 1.0/(0.04 + 0.011 x #Faces), when overclocked to 1950 MHz.
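The formula can be read as a fixed cost of about 0.04 s per frame plus about 0.011 s for every face found. As a small worked sketch (mask_fps is a hypothetical helper name), an empty scene gives 25 FPS, one face about 19.6 FPS and five faces about 10.5 FPS:

```cpp
// Frame rate predicted by FPS = 1.0/(0.04 + 0.011 x #Faces).
// The constants hold for a Raspberry Pi 4 overclocked to 1950 MHz.
double mask_fps(int faces)
{
    return 1.0 / (0.04 + 0.011 * faces);
}
```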
This application detects faces in a video stream. It is fast, over 80 FPS on a bare Raspberry Pi 4 with a 64-bit OS.
We have examples for three frameworks. MNN or ncnn must be installed before running the corresponding app. The OpenCV app, on the other hand, has no other dependencies. As you can see in the graph below, the MNN 8-bit quantized models are very fast.
If TensorFlow Lite segmentation was still a simple model, Yolact is of a completely different order. It is complex and very precise. Even objects of the same class are distinguished from each other, as can be seen from the different zebras in the picture. The input dimension is large (550x550 pixels). The model distinguishes 80 different classes. All this comes at a price; it is not fast on a bare Raspberry Pi 4.
Time to check your operating system if you are not sure. Run the command uname -a and compare your version with the screen dump below.
You also have to check your compiler version with the command gcc -v. It must be an aarch64-linux-gnu version as well, as shown in the screenshot. If you have a 64-bit operating system, but your gcc version is different from the one given above, reinstall the whole operating system with the latest version. The guide is found here: Install 64 bit OS on Raspberry Pi 4.
The 32-bit version gives the output shown below.
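The same check can be made from inside a program. The sketch below relies on the predefined GCC macros __aarch64__ and __arm__ to report which target the compiler builds for; the function names are hypothetical. It complements the uname -a and gcc -v checks rather than replacing them.

```cpp
#include <string>

// Reports the target the compiler builds for, based on predefined GCC macros.
std::string target_architecture()
{
#if defined(__aarch64__)
    return "aarch64 (64-bit ARM)";
#elif defined(__arm__)
    return "arm (32-bit ARM)";
#else
    return "other";
#endif
}

// Pointer width in bits: 64 on a 64-bit build, 32 on a 32-bit build.
int pointer_bits()
{
    return static_cast<int>(sizeof(void*) * 8);
}
```

On a correctly installed 64-bit OS, a program built with the default gcc is expected to report aarch64 and 64 bits.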
First, you need a good IDE to write your C++ program. You could use Geany as it comes with the Raspbian OS. However, Geany cannot handle projects, only individual files. You end up messing with Make to integrate all the different files into one executable. Secondly, Geany has limited debug tools.
We are going to use Code::Blocks. The IDE can handle multi-file projects and has excellent debug functions such as variable, thread or CPU register inspection. The IDE is relatively easy and intuitive to understand. With the following command in your terminal, you can install Code::Blocks.
$ sudo apt-get install codeblocks
Working with Code::Blocks involves the same steps. All our GitHub projects have project files (*.cbp). Once loaded in Code::Blocks, all environment options are automatically set correctly.
If you want to start with an empty project, please follow this procedure. First, you load your source code into the IDE. Second, you have to give the folder where the necessary headers are. Do this with the menu option Project → Build options.
Select tab sheet Search directories and under Compiler give the locations of the headers used. Please note, do not select the Debug or Release option, but select the project name instead, as can be seen below.
The next step is to specify the libraries used and the linker flags. Again, use menu option Project → Build options, but now select tab sheet Linker settings. Here the linker settings are shown as used in the Face detection app.
Just like the headers, you must give the location of the used libraries now. Do this in tab sheet Search directories, tab Linker.
Most settings are now done. However, some applications may require additional command line parameters during startup. Give these in menu option Project → Set programs' arguments. Select your target (Debug or Release) and give the arguments just like you would on the command line. Below are some examples.
Instead of using the GUI, you can modify all settings in the Code::Blocks project file (*.cbp). The file is in XML format and is very readable. Before you make any modification, close the project in Code::Blocks first. This way, any change you make will be loaded in Code::Blocks when you reopen your project.
Most deep learning C++ examples work with a video stream. We often provide an mp4 movie that illustrates the functionality of the app. However, if you want to use a camera in your application, it can be done by altering just one line of code. Look for the line where the VideoCapture is declared. There, replace the name of the movie with a number. Below you see a code snippet.
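As a small sketch of the idea, the helper below decides whether a given source string names a camera (a number) or a movie (a file name), so the choice can be made from a program argument. The name is_camera_index and the file name movie.mp4 are hypothetical. The OpenCV lines are shown as comments only, so the block compiles without OpenCV installed.

```cpp
#include <cctype>
#include <string>

// Returns true when `source` is non-empty and consists of digits only,
// which means it can be used as a camera index.
bool is_camera_index(const std::string& source)
{
    if (source.empty()) {
        return false;
    }
    for (unsigned char ch : source) {
        if (!std::isdigit(ch)) {
            return false;
        }
    }
    return true;
}

// With OpenCV, the two variants of the declaration would look like this:
//   cv::VideoCapture cap("movie.mp4");   // play a movie (hypothetical file name)
//   cv::VideoCapture cap(0);             // use the first camera instead
// A program argument could select between them:
//   if (is_camera_index(arg)) cap.open(std::stoi(arg)); else cap.open(arg);
```

Index 0 usually refers to the first camera the system finds; a second camera would be 1, and so on.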