Assistive Technologies based on Image Processing for People with Visual Impairments

Automatic face detection / recognition as a tool to support the social interaction of blind people



 

What we are doing

The first months of the project were devoted to a precise definition of the problem, an analysis of the state of the art, the definition of the system specifications, and the design of the system architecture.
Papers 1 and 2, describing the architecture and the issues related to preprocessing and image analysis, were presented at two international conferences.

In the last months of 2015 we generated a database of video sequences, both indoor and outdoor, acquired by blind people using a camera mounted either on eyeglasses or on a necklace.

These sequences have been used to identify the specific characteristics of the images our system will have to deal with; in particular, problems related to poor illumination, image distortion, and large camera movements will have to be carefully taken into account.
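As an illustration of the preprocessing such frames may need, the sketch below applies plain global histogram equalization, a common first remedy for poorly illuminated images (the function name and the synthetic test frame are our own assumptions; the project's actual preprocessing chain is not detailed here):

```python
import numpy as np

def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Spread a narrow intensity range over the full 0-255 scale."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Map each grey level through the normalized cumulative histogram.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0).astype(np.uint8)
    return lut[gray]

# Synthetic under-exposed frame: intensities squeezed into the 10-49 band.
rng = np.random.default_rng(0)
frame = (rng.random((120, 160)) * 40 + 10).astype(np.uint8)
enhanced = equalize_histogram(frame)
```

After equalization the frame spans the full intensity range, which tends to make downstream face detection more stable in dark indoor shots.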


The first step in our processing chain is face detection. In 2016, after an analysis of the state of the art in face detection algorithms, the most promising candidates were selected and tested to determine which best suited our needs.


The above picture shows two images taken in our cafeteria with the eyeglass-mounted (top) and the necklace-mounted (bottom) cameras. The red circles mark the faces detected by the PICO algorithm. No face is found in the top image, probably because of the large distortion in its peripheral areas, whereas the two faces in the bottom image are correctly detected.

Thanks to Prof. Herbert Frey, on leave from the University of Ulm (Germany), a complete VisualC face detection tool was created that can also crop and preprocess the detected faces; its output can feed a face recognition algorithm.
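The crop-and-normalize step can be sketched in Python as follows (the `crop_faces` name, the OpenCV-style `(x, y, w, h)` box format, and the 96×96 output size are illustrative assumptions, not details of the actual tool):

```python
import numpy as np

def crop_faces(frame: np.ndarray, boxes, out_size=(96, 96)):
    """Cut each detected bounding box out of the frame and resize it to a
    fixed size via nearest-neighbour index sampling, so every crop matches
    the input size a recognizer would expect."""
    h, w = frame.shape[:2]
    crops = []
    for x, y, bw, bh in boxes:
        # Clip the box to the frame, in case the detector overshoots.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + bw, w), min(y + bh, h)
        patch = frame[y0:y1, x0:x1]
        rows = np.linspace(0, patch.shape[0] - 1, out_size[0]).astype(int)
        cols = np.linspace(0, patch.shape[1] - 1, out_size[1]).astype(int)
        crops.append(patch[np.ix_(rows, cols)])
    return crops

# Two hypothetical detections in a 240x320 grayscale frame.
frame = np.zeros((240, 320), dtype=np.uint8)
faces = crop_faces(frame, [(50, 60, 40, 40), (200, 100, 30, 50)])
```

Fixing the crop size up front keeps the detector and the recognizer decoupled: any detector that emits bounding boxes can feed the same recognition stage.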
In parallel, an analysis of possible real-time implementations of the system is being carried out. One option is to connect the camera to a smartphone, with the software implemented in Android; we are also weighing the pros and cons of embedded systems, either single-board computers such as the Raspberry Pi or systems based on FPGAs, GPUs, or DSPs.

The approach being followed in this project, which aims at including the potential users of the system in all the phases of its design and experimentation, was also presented in papers 3 and 6.

First tests of the face detection and recognition tools, partially presented in papers 4 and 5, gave unsatisfactory results: the acquired video is highly variable and generally of poor quality, while the user must be given responses with an extremely low false recognition rate. It was therefore decided to turn to deep learning methods, which have recently proved very successful in the general field of computer vision.
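One way such methods keep the false recognition rate low is to compare a deep embedding of the query face against a gallery of known people and reject any match below a similarity threshold. A minimal sketch, assuming embeddings have already been produced by some network (the `identify` name, the gallery layout, and the 0.6 threshold are our own illustrative choices):

```python
import numpy as np

def identify(query_emb: np.ndarray, gallery: dict, threshold: float = 0.6):
    """Return the gallery identity with the highest cosine similarity,
    or None when even the best match falls below the threshold, so the
    system answers "unknown" rather than risk a false recognition."""
    q = query_emb / np.linalg.norm(query_emb)
    best_name, best_sim = None, -1.0
    for name, emb in gallery.items():
        sim = float(q @ (emb / np.linalg.norm(emb)))
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None

# Toy 3-D embeddings standing in for real network outputs.
gallery = {"alice": np.array([1.0, 0.0, 0.0]), "bob": np.array([0.0, 1.0, 0.0])}
match = identify(np.array([0.9, 0.1, 0.0]), gallery)   # close to "alice"
reject = identify(np.array([0.5, 0.5, 0.7]), gallery)  # close to no one
```

Raising the threshold trades missed recognitions for fewer false ones, which matches the requirement of an extremely low false recognition rate.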
From December 2016 to June 2017 Dr. Jhilik Bhattacharya, on leave from Thapar University, Punjab (India), worked in our Lab on these methods. She will continue to cooperate from her home institution after the end of her sabbatical semester.
In June 2017 a Users Group, formed by several persons with visual disabilities and experts in the field, was created as a branch of a "Living Lab" founded by Prof. Ilaria Garofolo and her coworkers at the University of Trieste, to improve the User Experience (UX) aspects of the project.
A deep-learning-based method for face identification was presented at a workshop in Rome in September 2017. A prototype of a portable device (see photograph) was built and first experiments were performed.

In March 2018 the project was discussed at a national meeting of the Italian association for the blind and visually impaired.