This web page supports Chapter 14 of the book Exploring Raspberry Pi – Interfacing to the Real World with Embedded Linux. The summary introduction to the chapter is as follows:

In this chapter, peripherals are attached to the Raspberry Pi (RPi) so that it can be used for capturing image, video, and audio data using low-level Linux drivers and application programming interfaces (APIs). The chapter describes Linux applications and tools that can be used to stream captured video and audio data to the Internet. It investigates Open Source Computer Vision (OpenCV) image processing and computer vision approaches that enable the RPi to draw inferences from the information content of the captured image data. Capture and playback of audio streams are described, along with the use of Bluetooth A2DP audio. The chapter also covers some applications of audio on the RPi, including streaming audio, Internet radio, and text-to-speech (TTS).

After completing this chapter, you should be able to do the following:

  • Capture image and video data on the RPi using the RPi MMAL camera or USB webcams combined with Linux Video4Linux2 drivers and APIs.
  • Use Video4Linux2 utilities to get information from and adjust the properties of video capture devices.
  • Stream video data to the Internet using Linux applications and UDP, multicast, and RTP streams.
  • Use OpenCV to perform basic image processing on the RPi.
  • Use OpenCV to perform a computer vision face-detection task.
  • Utilize the Boost C++ libraries on the RPi.
  • Play audio data on the RPi using HDMI audio and USB audio adapters. The audio data can be raw waveform data or compressed MP3 data from the RPi file system or from Internet radio streams.
  • Record audio data using USB audio adapters or webcams.
  • Stream audio data to the Internet using UDP.
  • Play audio to Bluetooth A2DP audio devices, such as Hi-Fi systems.
  • Use text-to-speech (TTS) approaches to verbalize the text output of commands that are executed on the RPi.
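As an illustration of the last objective, the following is a minimal sketch of speaking the output of a command. It assumes a command-line TTS engine that accepts the text to speak as its first argument; `espeak` is used here as an assumption and is not necessarily the engine used in the chapter. The function names `command_output` and `speak` are hypothetical.

```python
import shutil
import subprocess


def command_output(cmd):
    """Run cmd (a list of arguments) and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def speak(text, engine="espeak"):
    """Speak text with a command-line TTS engine.

    Returns False if the engine is not installed or exits with an error.
    The engine name is an assumption; substitute whichever engine you use.
    """
    if shutil.which(engine) is None:
        return False
    return subprocess.run([engine, text]).returncode == 0


if __name__ == "__main__":
    text = command_output(["date"])
    if not speak(text):
        # Fall back to printing when no working TTS engine is available.
        print(f"(no TTS engine available) {text}")
```

Keeping the command capture separate from the speech step means the same text can be sent to a different engine, or to a file, without changing how the command is run.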

Digital Media Resources

Below are some high-resolution images of the circuits described in the book. They are reproduced in colour and can be printed at high resolution to help you build the circuits.

The RPi MMAL Camera and its Attachment to the RPi 2/3

Example Processed Images


Video: Video Capture and Image Processing

In this video I look at how you can get started with video capture and image processing on the BeagleBone. It is an introductory video that should give people who are new to this topic a starting point to work from. I look at three distinct challenges: how to capture video from a USB webcam under Linux; how to capture image frames from a USB webcam under Linux; and how to use OpenCV to capture and process frames so that you can build computer vision applications under Linux on the BeagleBone.
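To give a feel for the kind of per-frame processing involved, here is a minimal sketch of greyscale conversion followed by a simple gradient threshold. It is plain Python working on a synthetic frame, not OpenCV and not the code from the video; OpenCV offers optimized equivalents such as `cv::cvtColor` and `cv::Canny`. The function names and the threshold value are illustrative assumptions.

```python
def to_gray(frame):
    """Convert rows of (r, g, b) pixels to rows of 0-255 luminance values.

    Uses integer forms of the BT.601 weights (0.299, 0.587, 0.114).
    """
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in frame]


def edges(gray, threshold=64):
    """Mark pixels whose intensity step to the right or below exceeds threshold.

    The last row and last column have no neighbour to compare with,
    so they are left at 0.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])
            dy = abs(gray[y + 1][x] - gray[y][x])
            if max(dx, dy) > threshold:
                out[y][x] = 255
    return out


if __name__ == "__main__":
    # Synthetic 4x6 frame: left half black, right half white.
    black, white = (0, 0, 0), (255, 255, 255)
    frame = [[black] * 3 + [white] * 3 for _ in range(4)]
    for row in edges(to_gray(frame)):
        print(row)
```

With a real camera, the synthetic frame would be replaced by frames read from the capture device, and the nested loops by library calls that run far faster on the board.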

Video: Streaming Video & Custom Video Player

In this video I look at video streaming on the BeagleBone Black using RTP, UDP unicasting, and UDP multicasting, which allows one-to-many streaming. In all of these examples I use the VLC media player to display the video data. The final part of the video describes how you can build your own software implementation that displays the data using LibVLC and the Qt framework. The advantage of doing this is that you can add your own data processing and control functionality to the video display. You could even develop code for capturing multiple streams simultaneously and processing the data, for example, for stereo imaging.
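The difference between UDP unicast and multicast at the socket level can be sketched as follows. This is a minimal illustration using Python's standard `socket` module, not the streaming pipeline used in the video. The function names, the chunk size, and the group address mentioned in the comments are assumptions chosen for the example.

```python
import socket
import struct


def make_receiver(port, group=None):
    """Return a UDP socket bound to port; join a multicast group if one is given."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    if group is not None:
        # Ask the kernel to deliver datagrams addressed to the group
        # (for example "239.0.0.1", a hypothetical address for illustration).
        mreq = struct.pack("4s4s", socket.inet_aton(group),
                           socket.inet_aton("0.0.0.0"))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def send_chunks(data, address, chunk_size=1316, ttl=1):
    """Send data to address as UDP datagrams; return the number sent.

    The address may be a unicast host or a multicast group. 1316 bytes is
    7 x 188-byte MPEG-TS packets, a common payload size for video over UDP.
    """
    count = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # TTL 1 keeps multicast traffic on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        for start in range(0, len(data), chunk_size):
            sock.sendto(data[start:start + chunk_size], address)
            count += 1
    return count


if __name__ == "__main__":
    receiver = make_receiver(0)          # port 0: let the OS pick a free port
    receiver.settimeout(2)
    port = receiver.getsockname()[1]
    sent = send_chunks(b"x" * 3000, ("127.0.0.1", port))  # unicast to loopback
    received = sum(len(receiver.recvfrom(2048)[0]) for _ in range(sent))
    print(f"sent {sent} datagrams, received {received} bytes")
    receiver.close()
```

The sender code is the same in both cases; only the destination address changes. For multicast, each receiver joins the group and the network delivers one transmitted stream to all of them, which is what makes one-to-many streaming possible.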

External Resources

If you wish to expand your knowledge of computer vision on the BeagleBone platform, the book “Learning OpenCV: Computer Vision in C++ with the OpenCV Library” by Adrian Kaehler and Gary Bradski is probably the best option. Please ensure that you purchase or pre-order the C++ version of the book, as the non-C++ version was published in 2008 and has dated somewhat. The C++ version is due for release in June 2015.

You can purchase it on Amazon at the following links: (USA) (Canada) (UK) (Germany) (France) (Italy) (Spain)

  • Video4Linux2 core documentation:
  • V4L2 API Specification:
  • Computer Vision Cascaded Classification:
  • CVonline: The Evolving, Distributed, Non‐Proprietary, On‐Line Compendium of Computer Vision, at
  • The Boost C++ Libraries, Boris Schäling:



Recommended Books on the Content in this Chapter