The JeVois camera is simply the smallest embedded camera capable of running TensorFlow and other AI or computer vision software.
In 2017, ARM selected the design as one of ten for its new worldwide ARM Innovators program.
The JeVois camera has everything on board for computer vision tasks. It consists of the following hardware:
- 1.3 Mpixel rolling shutter camera.
- Quad-core ARM Cortex-A7 @ 1.35 GHz with support of VFP and NEON instructions.
- Dual-core MALI-400 GPU, support of OpenGL
- 256 Mbyte DDR3 RAM
- USB 2.0 interface
- Serial port (UART)
- Micro-SD card
- Integrated lens
- Integrated fan
- Power: 5V max 3.5 W (0.7 Amp) over USB.
Once connected to a USB port on a PC, it identifies itself as a regular webcam. With simple commands, you can enable the embedded vision and deep learning routines.
One minor drawback is the memory size. On this board the ARM CPU and the GPU share the same physical memory, and 256 Mbyte is not much to divide between them.
The ARM quad-core runs on a Linux OS, just like a Raspberry Pi. It can support any software you like. To mention some popular software packages:
- DarkNet (YOLO)
All the C++ source code is open source and is very well documented.
Developing software for the JeVois is somewhat unusual. Because the JeVois has neither a display nor a keyboard, every piece of information must be sent via USB. You can browse the MicroSD card and capture video at the same time.
There are two possible strategies to write your software for the JeVois.
You can alter existing Python script files on the MicroSD card inside the JeVois with the browser and see if it works.
The other way is to write the software on a host, with all the debug facilities at hand, and then transfer the new app to the JeVois.
If you use the same software on both sides, namely OpenCV, C++17 and Python, porting from your host system to the JeVois can be relatively error-free.
Yet it remains a different way of programming, especially if you are used to debugging tools and IDEs.
The JeVois can run many different vision algorithms. Edge detection, bar-code reading, digit recognition, eye tracking and object matching are some of the modules that come with the original JeVois software. Let's look here at some of the deep learning applications that ship with it.
Open Images Dataset v4.
First, we need some validation images, which can be a real bother. Most deep learning frameworks keep their images in one large database, for example the LMDB format in Caffe, from which individual images are difficult to extract. This is where the Google Open Images Dataset comes in handy. With about 9 million annotated images, and bounding boxes for 600 classes, this is exactly what we need. On GitHub, that treasure house, you can find the OIDv4_ToolKit. This application can be used to assemble your own dataset. Install the Linux software with these commands:
$ git clone https://github.com/EscVM/OIDv4_ToolKit.git
$ cd OIDv4_ToolKit
$ sudo pip3 install -r requirements.txt
And collect your images with (classes are exemplary):
$ python3 main.py downloader --classes \
Cat Dog Horse Car Hippopotamus \
Eagle Cello Teapot Zebra \
Lion Spider Camel Apple Orange \
--type_csv train --limit 30 \
--noLabels --image_IsGroupOf 0
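After the download finishes, it can be worth checking that every class really received its images. The sketch below counts image files per class folder. It assumes the toolkit stores its results under OID/Dataset/train/<class name>/, which is an assumption about the default layout; the function name count_images_per_class is made up for this illustration.

```python
from pathlib import Path

# File extensions treated as images; adjust if your download differs.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_root):
    """Return {class_name: image_count} for every sub-folder of dataset_root."""
    root = Path(dataset_root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    # Assumed output location of the toolkit; change it if yours differs.
    root = Path("OID/Dataset/train")
    if root.is_dir():
        for name, n in count_images_per_class(root).items():
            print(f"{name:15s} {n:3d} images")
    else:
        print(f"{root} not found; run the downloader first")
```

With --limit 30 as in the command above, a class that shows fewer than 30 images simply had fewer matching samples or failed downloads.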
JeVois has a special TensorFlow module on board. It can execute TensorFlow Lite models. Given the size of the memory, these models need to be relatively small. Something like a VGG16, whose weights run to several hundred Mbyte, will be too large. Better use something like a MobileNet. MobileNet comes in many flavours. The smallest is the fastest, but also the least accurate. Here MobileNet 128x128 0.5 is employed. It runs at 38.5 FPS with 40% correct on the ImageNet dataset (1000 different categories to be recognized). In the meantime, however, better networks such as MobileNet V2, SqueezeNet or ShuffleNet have been developed.
JeVois running a TensorFlow MobileNet
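A classifier such as the MobileNet above ends with one score per category, and the label shown on screen is just the best-scoring entry. The sketch below illustrates that last step in plain Python, assuming the network returns raw scores (logits). It is not the JeVois module's code, and the function names, labels and score values are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(zip(probs, labels), key=lambda pair: pair[0], reverse=True)
    return [(label, prob) for prob, label in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical scores for three of the 1000 ImageNet categories.
    labels = ["cat", "dog", "horse"]
    scores = [1.0, 3.0, 2.0]
    for label, prob in top_k(scores, labels, k=2):
        print(f"{label}: {prob:.2f}")
```

An accuracy figure like the 40% quoted above counts how often the correct label comes out of this ranking over a whole validation set.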
Another framework JeVois supports is Darknet. Darknet is a software application written by Joseph Redmon. It is capable of running neural networks, just like TensorFlow, Caffe or PyTorch. The name also refers to a type of deep learning network designed by the same author. Here a smaller version of Darknet is executed, the so-called Darknet tiny. Its weights occupy only 4 Mbyte of memory and the accuracy is about 58.7%, just above SqueezeNet. The input size is 224x224, which makes prediction slower (± 2 FPS) than running at 128x128 (± 5.5 FPS).
JeVois running a Darknet
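The gap between the two frame rates quoted for Darknet follows roughly from the number of input pixels. The short calculation below compares the pixel ratio of the two input sizes with the ratio of the measured frame rates. It rests on the simplifying assumption that convolution work grows in proportion to the pixel count; the helper names are made up for this illustration.

```python
def pixel_ratio(size_a, size_b):
    """How many times more pixels a square size_a input has than size_b."""
    return (size_a * size_a) / (size_b * size_b)


def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# Figures quoted in the text above.
LARGE, SMALL = 224, 128
FPS_LARGE, FPS_SMALL = 2.0, 5.5

if __name__ == "__main__":
    work = pixel_ratio(LARGE, SMALL)   # 3.0625 times more pixels
    speed = FPS_SMALL / FPS_LARGE      # 2.75 times faster at 128x128
    print(f"pixel ratio  : {work:.4f}")
    print(f"speed ratio  : {speed:.2f}")
    print(f"frame time at {LARGE}x{LARGE}: {frame_time_ms(FPS_LARGE):.0f} ms")
    print(f"frame time at {SMALL}x{SMALL}: {frame_time_ms(FPS_SMALL):.0f} ms")
```

The two ratios are close but not equal, which is what you would expect when part of the time per frame does not depend on the input size.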
The last example is JeVois running YOLO. YOLO (You Only Look Once) is a type of neural network that tries to identify more than one object in a scene. SSD (Single Shot Detection) is another well-known topology. Instead of a single final output, the structure of YOLO consists of a 2D grid of cells, each of which outputs a region in the scene. A clustering technique attempts to merge adjacent cells with approximately the same outcome. The more cells, the smaller the objects that can be detected. Of course, that also takes more processing time. The front end of the topology can be any type of deep learning network you like. An AlexNet or a MobileNet can be employed. Here the YOLO tiny is shown. The accuracy is around 33% on the COCO data set (80 different categories to recognize).
JeVois running YOLO V3
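The grid-and-merge idea described for YOLO can be sketched in a few lines. The example below first finds the grid cell that contains an object's centre, then merges overlapping predictions with greedy non-maximum suppression. Non-maximum suppression is the merging step commonly used in YOLO implementations and is swapped in here for the clustering mentioned in the text; this is not the JeVois or Darknet source. The 416 pixel image, the 13x13 grid and the boxes are hypothetical example values.

```python
def grid_cell(cx, cy, image_size, grid_size):
    """Return (column, row) of the grid cell containing the point (cx, cy)."""
    cell = image_size / grid_size
    col = min(int(cx // cell), grid_size - 1)
    row = min(int(cy // cell), grid_size - 1)
    return col, row


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the best-scoring box of every group of overlapping boxes.

    detections is a list of (score, box) pairs for one object class.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[0], reverse=True):
        # Accept a box only if it does not overlap a better box already kept.
        if all(iou(det[1], k[1]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    # Two neighbouring cells report nearly the same object, a third another one.
    detections = [
        (0.9, (10, 10, 50, 50)),
        (0.8, (12, 12, 52, 52)),
        (0.7, (100, 100, 140, 140)),
    ]
    for score, box in non_max_suppression(detections):
        print(score, box)
    print(grid_cell(200, 50, image_size=416, grid_size=13))
```

A finer grid means smaller cells, so smaller objects get a cell of their own, at the cost of more predictions to compute and merge.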