VLC based media player - problem with hanging after n loops of playback

Hi there,

My team and I are working on a VLC based Raspberry Pi 4 media player. Our software downloads and caches a playlist of 1080p movies, plays it through the Python bindings for VLC, and posts playback stats to a RabbitMQ message broker.

We were hoping to use Balena to deploy hundreds of these players but have run into a problem.

On a Raspbian Buster with desktop image, our software runs solidly for a week or more.

The identical software on BalenaOS, using the balenalib/raspberrypi3:buster image, plays n loops of the playlist and then hangs at the end of a video, with no error logs, warnings, or anything else that we’ve been able to find.

It fails in this state:

  • The Balena Cloud Reboot/Restart buttons don’t reboot or restart the device.
  • We can still interact with the Host and the app container running the media player via the Balena Cloud Terminal. free -m reports 288MB of free RAM, and top shows nothing using any significant CPU.
  • Attempting to restart the media player container via the Balena Cloud Services tab results in the Host reporting it has Killed the service but it doesn’t ever restart it.
  • Checking the supervisor logs journalctl -f -a -u resin-supervisor shows that it thinks it’s healthy.

Questions

  1. Have you noticed this behaviour before in other Balena apps?
  2. Do you have any ideas how we can debug this?
  3. Could you check over our Dockerfile.template & start script for obvious errors?
  4. Are there any Device Variables we should be setting differently?

If anyone has any ideas how to debug further that’d be great!

Cheers, Simon.

Technical details

Device Configuration

  • Define device GPU memory in megabytes: 512
  • Define DT parameters: "i2c_arm=on","spi=on","audio=on"
  • RESIN_HOST_CONFIG_arm_64bit: 1
  • RESIN_HOST_CONFIG_avoid_warnings: 1
  • RESIN_HOST_CONFIG_dtoverlay: "vc4-fkms-v3d"

Dockerfile.template

# Force Raspberry Pi 3 for 32-bit X
FROM balenalib/raspberrypi3:buster

# Use `install_packages` for dependencies
RUN install_packages vlc vlc-plugin-* g++ python3-pip python3-setuptools python3-dev build-essential \
  xserver-xorg-core \
  xinit lxsession desktop-file-utils \
  raspberrypi-ui-mods rpd-icons \
  gtk2-engines-clearlookspix \
  matchbox-keyboard \
  # For system volume
  libasound2-dev \
  # Audio
  alsa-utils \
  # Remove cursor
  unclutter

# disable lxpolkit popup warning
RUN mv /usr/bin/lxpolkit /usr/bin/lxpolkit.bak

# Set wallpaper
COPY /conf/desktop-items-0.conf /root/.config/pcmanfm/LXDE-pi/

# Autohide desktop panel
COPY /conf/panel /root/.config/lxpanel/LXDE-pi/panels/

# Hide desktop panel completely
COPY /conf/autostart /etc/xdg/lxsession/LXDE-pi/
COPY /conf/autostart /root/.config/lxsession/LXDE-pi/

# Disable screen from turning it off
RUN echo "#!/bin/bash" > /etc/X11/xinit/xserverrc \
  && echo "" >> /etc/X11/xinit/xserverrc \
  && echo 'exec /usr/bin/X -s 0 dpms -nolisten tcp "$@"' >> /etc/X11/xinit/xserverrc

# Enable udevd so that plugged dynamic hardware devices show up in our container.
ENV UDEV 1

# Install Python modules
COPY ./requirements/base.txt /code/requirements/base.txt
COPY ./requirements/prod.txt /code/requirements/prod.txt
RUN pip3 install -Ur /code/requirements/prod.txt

COPY . /code/
WORKDIR /code/

# pi.sh will run when the container starts up on the device
CMD ["bash","scripts/pi.sh"]

Start-up script

#!/bin/bash

# Allow VLC to run as root
sed -i 's/geteuid/getppid/' /usr/bin/vlc

# Remove the X server lock file so ours starts cleanly
rm /tmp/.X0-lock &>/dev/null || true

# Set the display to use
export DISPLAY=:0

# Set the DBUS address for sending around system messages
export DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket

# Start the desktop manager
echo "STARTING X"
startx -- -nocursor &

# TODO: work out how to detect X has started
sleep 5

# Hide the cursor
unclutter -display :0 -idle 0.1 &

# Start the VLC media player
python3 media_player.py
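
On the TODO in the start-up script: one way to replace the fixed sleep 5 might be to poll for the X server's Unix socket. Here is a minimal Python sketch, assuming the server for DISPLAY=:0 creates /tmp/.X11-unix/X0 (the usual Xorg location). wait_for_x and its parameters are hypothetical names, not part of the original project.

```python
import os
import time


def wait_for_x(display=":0", timeout=30.0, poll=0.25, socket_dir="/tmp/.X11-unix"):
    """Return True once the X server's Unix socket for `display` exists.

    Assumption: the server listens on <socket_dir>/X<number>, which is the
    conventional Xorg layout. Returns False if the timeout expires first.
    """
    # ":0" -> "0"; also drops a screen suffix such as ":0.0".
    number = display.split(":")[-1].split(".")[0]
    socket_path = os.path.join(socket_dir, "X" + number)

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(socket_path):
            return True
        time.sleep(poll)
    # One last check in case the socket appeared during the final sleep.
    return os.path.exists(socket_path)
```

This could be called at the top of media_player.py before the VLC instance is created. The socket appearing does not strictly guarantee that the whole desktop session is up, so a short extra delay may still be needed.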

Media Player

The core Python code looks something like this:

import vlc

class MediaPlayer():
    """
    A media player that communicates with our CMS to download resources
    and update the message broker with its playback status.
    """

    def __init__(self):
        self.playlist = []
        self.vlc = {
            'instance': None,
            'player': None,
            'list_player': None,
            'playlist': None,
        }
        self.init_vlc()

    def init_vlc(self):
        """
        Initialise the VLC variables:
            - The VLC instance
            - A MediaListPlayer for playing playlists
            - A MediaPlayer for controlling playback
            - A MediaList to load in the MediaListPlayer
        Documentation for these can be found here:
            http://www.olivieraubert.net/vlc/python-ctypes/doc/
        """
        flags = ['--quiet']
        self.vlc['instance'] = vlc.Instance(flags)
        self.vlc['list_player'] = self.vlc['instance'].media_list_player_new()
        self.vlc['player'] = self.vlc['list_player'].get_media_player()
        self.vlc['playlist'] = self.vlc['instance'].media_list_new()
        self.vlc['player'].set_fullscreen(True)
        self.vlc['list_player'].set_playback_mode(vlc.PlaybackMode.loop)


if __name__ == "__main__":
    media_player = MediaPlayer()
    media_player.download_playlist()  # Adds media to self.vlc['list_player']
    media_player.vlc['list_player'].play()
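
Since the hang produces no logs, one debugging aid might be a small watchdog that notices when the playback position stops advancing. The sketch below is only an illustration: StallDetector is a hypothetical helper, and it assumes the position comes from a callable such as the python-vlc MediaPlayer.get_time() method, which reports milliseconds.

```python
import time


class StallDetector:
    """Flags playback as stalled when the reported position stops changing."""

    def __init__(self, get_position_ms, stall_after_s=30.0, clock=time.monotonic):
        # get_position_ms: any zero-argument callable returning the current
        # position, e.g. media_player.vlc['player'].get_time (assumption).
        self._get_position_ms = get_position_ms
        self._stall_after_s = stall_after_s
        self._clock = clock
        self._last_position = None
        self._last_change = clock()

    def check(self):
        """Return True if the position has been unchanged for stall_after_s."""
        position = self._get_position_ms()
        now = self._clock()
        if position != self._last_position:
            # Position moved (or this is the first reading): reset the timer.
            self._last_position = position
            self._last_change = now
            return False
        return (now - self._last_change) >= self._stall_after_s
```

Calling check() every few seconds from the main loop and logging (or posting to the message broker) when it returns True would at least timestamp the hang. A deliberately paused player would also look stalled, so the threshold needs to suit the playlist.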

In today’s debugging session we ruled out:

  1. Docker - we’re running the same Dockerfile.template in Raspbian and it runs without hanging (in both headless & desktop installs).
  2. Python VLC module - we manually ran VLC with the same playlist of videos in Balena and it hung in the same way.

Today we’re testing:

  1. Running the repo without startx.
  2. Testing a Raspberry Pi 3 (32-bit kernel) with the current repo on BalenaOS.

Will leave these running tonight and report tomorrow.

Results:

  1. Running without startx made no difference.
  2. Running on a Raspberry Pi 3 with BalenaOS, the media player plays without hanging/crashing.

So Balena staff may need to help debug what differs when running 64-bit BalenaOS on the Raspberry Pi 4.

Question:

  1. Could it be that there’s a small VLC memory leak or growing memory use by BalenaOS? How does BalenaOS handle swap/memory pressure? See attached graphs of our Raspberry Pi 3 32-bit tests.

@sighmon what exactly are you monitoring with Grafana? Is it the VLC process or the Python process? Also: which exact version of VLC are you using?

@ffissore We’re measuring the Process Resident Memory (MB) of the RPi 3s there - unfortunately it’s not broken down by process, that’s the total. Here’s what the graph looks like today - seems that they manage their memory when it hits a peak, and everything continues fine (the media players on those Pi3s were all running this morning).

VLC version 3.0.8:

# vlc --version
VLC media player 3.0.8 Vetinari (revision 3.0.8-0-gf350b6b5a7)
VLC version 3.0.8 Vetinari (3.0.8-0-gf350b6b5a7)
Compiled by serge on arm-build.pitowers.org (Nov 29 2019 14:32:53)
Compiler: gcc version 8.3.0 (Raspbian 8.3.0-6+rpi1)
This program comes with NO WARRANTY, to the extent permitted by law.
You may redistribute it under the terms of the GNU General Public License;
see the file named COPYING for details.
Written by the VideoLAN team; see the AUTHORS file.

Hey Simon!

I think breaking it down by process is essential to see what’s going on. VLC is a very stable piece of software, and I doubt it has such a noticeable memory leak (no one knows for sure, though!), so I suspect the leak comes from the process that uses VLC through the Python bindings. A per-process breakdown would help confirm that.

I’m not sure if you can easily do that on Grafana, but in the worst case, try to go into the device when the memory leak is happening and check the processes running on the device and the memory they are using.
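
If it helps, the per-process check can be scripted from a terminal on the device. This is a rough Python sketch that reads the VmRSS field from /proc/<pid>/status on Linux; parse_status and rss_by_process are hypothetical helper names, not part of any Balena tooling.

```python
import os


def parse_status(text):
    """Extract (name, rss_kb) from the text of a /proc/<pid>/status file."""
    name, rss_kb = None, 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     12345 kB".
            rss_kb = int(line.split()[1])
    # Kernel threads have no VmRSS line, so they report 0.
    return name, rss_kb


def rss_by_process(proc_dir="/proc"):
    """Return [(rss_kb, pid, name), ...] sorted with the largest first."""
    rows = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_dir, entry, "status")) as handle:
                name, rss_kb = parse_status(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    return sorted(rows, reverse=True)
```

Printing the first ten rows of rss_by_process() every minute or so while the playlist loops should show whether vlc, python3, or something else is the one growing.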

This ended up being a bug that was fixed in balenaOS 2.47.0+rev8. :partying_face:

That’s awesome. Thanks for letting us know!

What exporter are you using, especially for video buffers?

@ozelada We’ve got a custom Prometheus exporter.
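
For anyone wanting something similar, here is a rough idea of the output side of such an exporter. It is not our exporter, just a minimal sketch that renders per-process memory in the Prometheus text exposition format; render_metrics and the metric name are made up for illustration.

```python
def escape_label(value):
    """Escape a label value as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(rss_bytes_by_process,
                   metric="player_process_resident_memory_bytes"):
    """Render {process_name: rss_bytes} as Prometheus text exposition lines."""
    lines = [
        f"# HELP {metric} Resident memory per process in bytes.",
        f"# TYPE {metric} gauge",
    ]
    # Sort by name so that successive scrapes list processes in a stable order.
    for name in sorted(rss_bytes_by_process):
        value = rss_bytes_by_process[name]
        lines.append(f'{metric}{{process="{escape_label(name)}"}} {value}')
    return "\n".join(lines) + "\n"
```

Serving that string over HTTP at a /metrics path is what lets Prometheus scrape it, and Grafana can then graph it per process label.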