Raspberry Pi 3b and 4 reporting memory allocation errors when working with USB UVC cameras

I have three Raspberry Pi 3b and 4 devices with a libuvc based program reading video frames from two USB UVC cameras. balenaOS 2.38.0+rev1

Occasionally, the kernel logs a page allocation failure in usbdev_do_ioctl. At about the same time my program will stop receiving video frames from one of the cameras.

Suggestions?

[  134.889141] camera_controll: page allocation failure: order:5, mode:0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
[  134.889178] camera_controll cpuset=d3d06553257a45a31bb2699524f98800b6b337a59d47a5783b190e1933cc9c97 mems_allowed=0
[  134.889205] CPU: 1 PID: 2084 Comm: camera_controll Tainted: G         C      4.14.98 #1
[  134.889211] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT)
[  134.889218] Call trace:
[  134.889238] [<ffffff9b8b28b2f0>] dump_backtrace+0x0/0x258
[  134.889248] [<ffffff9b8b28b56c>] show_stack+0x24/0x30
[  134.889260] [<ffffff9b8ba45918>] dump_stack+0xac/0xe4
[  134.889271] [<ffffff9b8b3befa4>] warn_alloc+0x10c/0x198
[  134.889280] [<ffffff9b8b3bfdbc>] __alloc_pages_nodemask+0xd04/0xd90
[  134.889290] [<ffffff9b8b3e4ce0>] kmalloc_order+0x30/0x58
[  134.889299] [<ffffff9b8b3e4d44>] kmalloc_order_trace+0x3c/0x110
[  134.889311] [<ffffff9b8b413a84>] __kmalloc+0x284/0x2f0
[  134.889325] [<ffffff9b8b7e28a4>] proc_do_submiturb+0xccc/0xde8
[  134.889333] [<ffffff9b8b7e38e4>] usbdev_do_ioctl+0xf24/0x15b0
[  134.889342] [<ffffff9b8b7e3fec>] usbdev_ioctl+0x34/0x48
[  134.889351] [<ffffff9b8b440ffc>] do_vfs_ioctl+0xc4/0x840
[  134.889359] [<ffffff9b8b4417c8>] SyS_ioctl+0x50/0x88
[  134.889367] Exception stack(0xffffff800bdbbec0 to 0xffffff800bdbc000)
[  134.889376] bec0: 000000000000000d 000000008038550a 0000007f84146240 0000007fa6c52b10
[  134.889384] bee0: 0000007f9f8fe4e0 0000007f9f8fe580 0000007fa7eb07a8 0000000000000000
[  134.889393] bf00: 000000000000001d 0000007fa4662000 0000007fa70bb0b8 0000007f841463f8
[  134.889401] bf20: 0000000000000c00 0000000000000020 00032be8f8a18d5d 00000dc563b09730
[  134.889410] bf40: 0000007fa6b5d2c8 0000007fa7082af8 0000007f9ef03364 0000007fa714d000
[  134.889418] bf60: 000000000000000d 000000008038550a 0000007f84146240 0000007fa73e0000
[  134.889426] bf80: 0000007fa191bcf8 0000000000000020 0000007fa1f04764 0000007fa239a458
[  134.889435] bfa0: 0000007fa191bc80 0000007f9f8fe4e0 0000007fa7082be0 0000007f9f8fe4e0
[  134.889443] bfc0: 0000007fa6c52b1c 0000000060000000 000000000000000d 000000000000001d
[  134.889451] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  134.889460] [<ffffff9b8b283ac0>] el0_svc_naked+0x34/0x38
[  134.889467] Mem-Info:
[  134.889482] active_anon:129547 inactive_anon:164 isolated_anon:0
                active_file:13268 inactive_file:9864 isolated_file:0
                unevictable:0 dirty:67 writeback:0 unstable:0
                slab_reclaimable:7302 slab_unreclaimable:9162
                mapped:20693 shmem:301 pagetables:1460 bounce:0
                free:63424 free_pcp:28 free_cma:60448
[  134.889503] Node 0 active_anon:518188kB inactive_anon:656kB active_file:53072kB inactive_file:39456kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:82772kB dirty:268kB writeback:0kB shmem:1204kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[  134.889518] DMA free:253696kB min:8192kB low:10240kB high:12288kB active_anon:517776kB inactive_anon:656kB active_file:52752kB inactive_file:39328kB unevictable:0kB writepending:268kB present:1021952kB managed:990884kB mlocked:0kB kernel_stack:3904kB pagetables:5840kB bounce:0kB free_pcp:108kB local_pcp:0kB free_cma:241792kB
[  134.889524] lowmem_reserve[]: 0 0 0
[  134.889540] DMA: 358*4kB (UMEHC) 135*8kB (UMEHC) 57*16kB (UMEHC) 60*32kB (UMEHC) 26*64kB (UEHC) 10*128kB (UMH) 5*256kB (UHC) 2*512kB (HC) 2*1024kB (EC) 2*2048kB (UC) 58*4096kB (C) = 254304kB
[  134.889612] 23437 total pagecache pages
[  134.889619] 0 pages in swap cache
[  134.889625] Swap cache stats: add 0, delete 0, find 0/0
[  134.889631] Free swap  = 0kB
[  134.889636] Total swap = 0kB
[  134.889644] 255488 pages RAM
[  134.889649] 0 pages HighMem/MovableOnly
[  134.889654] 7767 pages reserved
[  134.889660] 65536 pages cma reserved

I’m going to try logging /proc/buddyinfo and /proc/meminfo to see if there are any interesting patterns.

Found one issue, the libuvc library was allocating 100 16 MB buffers per camera. Thankfully it didn’t actually have to use all of them! I was able to reduce this to 10 16MB buffers per camera.

However there still appears to be a memory leak in my code and/or the system.

Hey Jason, let us know how it goes and how we could help out.

@gelbal

Here is a bit more numbers. The device was rebooted about 12 hours before I started this test. There are two “work breaks” every day during which I can poke at the devices as long as I get them back to working before the end of the break.

I have one container, with a bash script that runs the main application in a while true; do ... done loop. I got the process resident size (RSS), virtual size (VSZ) (even though these Raspberry Pi’s don’t have swap…?), /proc/buddyinfo/ and the output of free. Then I killed the process and got the same information; restarted the container, then rebooted the device.

My conclusion is that only rebooting the device frees the memory.

The device: 2dba14e0055c1d368f1af97965931379
The command I used to get the information after each manipulation

ps -eo rss,vsz,pid,euser,args:100 --sort %mem | grep `pgrep camera_controll` ; cat /proc/buddyinfo ; free --wide --mega

First the process memory – restarting the service “looks” effective, the VSZ doesn’t drop after only restarting the process:

                       RSS        VSZ        PID USER Command and arguments
start:                 17372   473860       5724 root ./camera_controller
after restart process: 15748   473860      26609 root ./camera_controller
after restart service: 11892   212776        241 root ./camera_controller
after reboot device:   15948   473860        489 root ./camera_controller

This is what free reports. Restarting the process and the service do not result in lots of memory being freed. Only rebooting the device worked:

                              total        used        free      shared     buffers       cache   available
start:                 Mem:     990         420         269          14          25         275         586
after restart process: Mem:     990         421         269          14          25         275         586
after restart service: Mem:     990         415         279          14          26         270         578
after reboot device:   Mem:     990         270         496           1          11         211         686

And /proc/buddyinfo, memory remains fragmented (not lots of larger free sections) until after a reboot:

                       4K pages, chunk count:   2^0    2^1    2^2    2^3    2^4    2^5    2^6    2^7    2^9   2^10   2^11
start:                 Node 0, zone      DMA    767   1408    412    109     30      7      6      4      3      1     57 
after restart process: Node 0, zone      DMA    776   1413    415    107     29      5      5      3      2      0     58 
after restart service: Node 0, zone      DMA    551   1766    591    151     53     22      4      4      2      0     58 
after reboot device:   Node 0, zone      DMA    198     71    126     74     27     25     25      5      6      2    114 

Suggestions? What am I missing? Why would memory not be freed after killing the process or restarting the service? What should I ask the HostOS to determine where the memory might be stuck?

Hi @jason10,

So some things to try to progress the hunt for the missing memory…

If you have access to the hardware, what happens if you physically remove the usb camera. Does the memory free up?

The next step would be to check the kernel driver itself. Its rare, but there could be a memory leak in the driver in the kernel itself. If it uses kmalloc without a kfree call that is run, there will be a leak.

From https://stackoverflow.com/questions/4727813/linux-kernel-memory-management

The kernel will not do any garbage collection for a module. If the module kmalloc s a chunk of memory and doesn’t kfree it before the module is unloaded, that chunk will stay allocated and inaccessible until the next reboot.

Regards
ZubairLK

In the case of usbdev the flags are GFP_KERNEL|__GFP_COMP which means “the kernel wants this memory and mark the pages of the memory as part of a composite.”

And /proc/slabinfo is also useful, here it is from a device (95cd34fb34796fa7b837c9341d2f0ac3) that has run long enough to exhaust it’s memory:

cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fat_inode_cache       92    154    720   22    4 : tunables    0    0    0 : slabdata      7      7      0
fat_cache            306    306     40  102    1 : tunables    0    0    0 : slabdata      3      3      0
ext4_groupinfo_1k   1950   1950    136   30    1 : tunables    0    0    0 : slabdata     65     65      0
PINGv6                 0      0   1152   28    8 : tunables    0    0    0 : slabdata      0      0      0
RAWv6                280    280   1152   28    8 : tunables    0    0    0 : slabdata     10     10      0
UDPv6                300    300   1280   25    8 : tunables    0    0    0 : slabdata     12     12      0
tw_sock_TCPv6         85     85    232   17    1 : tunables    0    0    0 : slabdata      5      5      0
request_sock_TCPv6     52     52    304   26    2 : tunables    0    0    0 : slabdata      2      2      0
TCPv6                 98     98   2304   14    8 : tunables    0    0    0 : slabdata      7      7      0
nf_conntrack_expect      0      0    216   18    1 : tunables    0    0    0 : slabdata      0      0      0
nf_conntrack         276    375    320   25    2 : tunables    0    0    0 : slabdata     15     15      0
sd_ext_cdb           128    128     32  128    1 : tunables    0    0    0 : slabdata      1      1      0
cfq_io_cq              0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     72     72    896   18    4 : tunables    0    0    0 : slabdata      4      4      0
au_finfo            1293   1323    192   21    1 : tunables    0    0    0 : slabdata     63     63      0
au_icntnr           1071   1407    768   21    4 : tunables    0    0    0 : slabdata     67     67      0
au_dinfo           11370  12474    192   21    1 : tunables    0    0    0 : slabdata    594    594      0
f2fs_extent_tree       0      0     72   56    1 : tunables    0    0    0 : slabdata      0      0      0
discard_cmd            0      0    112   36    1 : tunables    0    0    0 : slabdata      0      0      0
discard_entry          0      0     88   46    1 : tunables    0    0    0 : slabdata      0      0      0
free_nid               0      0     24  170    1 : tunables    0    0    0 : slabdata      0      0      0
f2fs_inode_cache       0      0   1104   29    8 : tunables    0    0    0 : slabdata      0      0      0
ovl_inode              0      0    664   24    4 : tunables    0    0    0 : slabdata      0      0      0
nfs_direct_cache       0      0    336   24    2 : tunables    0    0    0 : slabdata      0      0      0
nfs_commit_data       23     23    704   23    4 : tunables    0    0    0 : slabdata      1      1      0
nfs_read_data         36     36    896   18    4 : tunables    0    0    0 : slabdata      2      2      0
nfs_inode_cache        0      0   1064   30    8 : tunables    0    0    0 : slabdata      0      0      0
squashfs_inode_cache      0      0    640   25    4 : tunables    0    0    0 : slabdata      0      0      0
jbd2_journal_handle    340    340     48   85    1 : tunables    0    0    0 : slabdata      4      4      0
jbd2_journal_head    986    986    120   34    1 : tunables    0    0    0 : slabdata     29     29      0
jbd2_revoke_table_s    768    768     16  256    1 : tunables    0    0    0 : slabdata      3      3      0
jbd2_revoke_record_s    640    640     32  128    1 : tunables    0    0    0 : slabdata      5      5      0
ext4_inode_cache    4020   7563   1064   30    8 : tunables    0    0    0 : slabdata    253    253      0
ext4_allocation_context    128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0
ext4_system_zone     408    408     40  102    1 : tunables    0    0    0 : slabdata      4      4      0
ext4_io_end         2944   2944     64   64    1 : tunables    0    0    0 : slabdata     46     46      0
ext4_extent_status   5938   6528     40  102    1 : tunables    0    0    0 : slabdata     64     64      0
mbcache             1898   2190     56   73    1 : tunables    0    0    0 : slabdata     30     30      0
dnotify_struct         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
posix_timers_cache     68     68    240   17    1 : tunables    0    0    0 : slabdata      4      4      0
rpc_inode_cache        0      0    640   25    4 : tunables    0    0    0 : slabdata      0      0      0
UNIX                1376   1376   1024   16    4 : tunables    0    0    0 : slabdata     86     86      0
ip4-frags             76     76    208   19    1 : tunables    0    0    0 : slabdata      4      4      0
xfrm_dst_cache         0      0    320   25    2 : tunables    0    0    0 : slabdata      0      0      0
xfrm_state             0      0    768   21    4 : tunables    0    0    0 : slabdata      0      0      0
PING                   0      0    960   17    4 : tunables    0    0    0 : slabdata      0      0      0
RAW                  221    221    960   17    4 : tunables    0    0    0 : slabdata     13     13      0
tw_sock_TCP           85     85    232   17    1 : tunables    0    0    0 : slabdata      5      5      0
request_sock_TCP     104    104    304   26    2 : tunables    0    0    0 : slabdata      4      4      0
TCP                  255    255   2112   15    8 : tunables    0    0    0 : slabdata     17     17      0
cachefiles_object_jar      0      0    384   21    2 : tunables    0    0    0 : slabdata      0      0      0
fscache_cookie_jar    160    160    128   32    1 : tunables    0    0    0 : slabdata      5      5      0
dquot                 64     64    256   16    1 : tunables    0    0    0 : slabdata      4      4      0
eventpoll_pwq       2464   2464     72   56    1 : tunables    0    0    0 : slabdata     44     44      0
inotify_inode_mark    969    969     80   51    1 : tunables    0    0    0 : slabdata     19     19      0
request_queue         28     28   2296   14    8 : tunables    0    0    0 : slabdata      2      2      0
blkdev_requests        0      0    304   26    2 : tunables    0    0    0 : slabdata      0      0      0
blkdev_ioc           156    156    104   39    1 : tunables    0    0    0 : slabdata      4      4      0
biovec-max           160    176   4096    8    8 : tunables    0    0    0 : slabdata     22     22      0
biovec-128            64     64   2048   16    8 : tunables    0    0    0 : slabdata      4      4      0
biovec-64             64     64   1024   16    4 : tunables    0    0    0 : slabdata      4      4      0
user_namespace         0      0    464   17    2 : tunables    0    0    0 : slabdata      0      0      0
uid_cache            105    105    192   21    1 : tunables    0    0    0 : slabdata      5      5      0
dmaengine-unmap-2   3667   4224     64   64    1 : tunables    0    0    0 : slabdata     66     66      0
sock_inode_cache    2200   2200    640   25    4 : tunables    0    0    0 : slabdata     88     88      0
skbuff_fclone_cache    496    496    512   16    2 : tunables    0    0    0 : slabdata     31     31      0
skbuff_head_cache   2417   2448    256   16    1 : tunables    0    0    0 : slabdata    153    153      0
file_lock_cache       80     80    200   20    1 : tunables    0    0    0 : slabdata      4      4      0
fsnotify_mark_connector    680    680     24  170    1 : tunables    0    0    0 : slabdata      4      4      0
net_namespace         25     25   6208    5    8 : tunables    0    0    0 : slabdata      5      5      0
shmem_inode_cache   2498   2724    672   24    4 : tunables    0    0    0 : slabdata    115    115      0
taskstats             96     96    328   24    2 : tunables    0    0    0 : slabdata      4      4      0
proc_dir_entry      2646   2646    192   21    1 : tunables    0    0    0 : slabdata    126    126      0
pde_opener          2652   2652     40  102    1 : tunables    0    0    0 : slabdata     26     26      0
proc_inode_cache   23640  35694    656   24    4 : tunables    0    0    0 : slabdata   3522   3522      0
sigqueue             100    100    160   25    1 : tunables    0    0    0 : slabdata      4      4      0
bdev_cache            76     76    832   19    4 : tunables    0    0    0 : slabdata      4      4      0
kernfs_node_cache  20424  20910    136   30    1 : tunables    0    0    0 : slabdata    697    697      0
mnt_cache           1659   1659    384   21    2 : tunables    0    0    0 : slabdata     79     79      0
filp                7777   8032    256   16    1 : tunables    0    0    0 : slabdata    502    502      0
inode_cache        10822  11368    584   28    4 : tunables    0    0    0 : slabdata    406    406      0
dentry            204149 872571    192   21    1 : tunables    0    0    0 : slabdata  41551  41551      0
names_cache           58     82   4096    8    8 : tunables    0    0    0 : slabdata     12     12      0
buffer_head        89106  95628    104   39    1 : tunables    0    0    0 : slabdata   2452   2452      0
uts_namespace         72     72    440   18    2 : tunables    0    0    0 : slabdata      4      4      0
nsproxy              292    292     56   73    1 : tunables    0    0    0 : slabdata      4      4      0
mm_struct            936    936    896   18    4 : tunables    0    0    0 : slabdata     52     52      0
files_cache         1104   1104    704   23    4 : tunables    0    0    0 : slabdata     48     48      0
signal_cache        1860   1860   1088   30    8 : tunables    0    0    0 : slabdata     62     62      0
sighand_cache        915    915   2112   15    8 : tunables    0    0    0 : slabdata     61     61      0
task_struct          520    596   7424    4    8 : tunables    0    0    0 : slabdata    149    149      0
cred_jar           12099  12306    192   21    1 : tunables    0    0    0 : slabdata    586    586      0
anon_vma_chain     22728  24448     64   64    1 : tunables    0    0    0 : slabdata    382    382      0
anon_vma           15329  15640     88   46    1 : tunables    0    0    0 : slabdata    340    340      0
pid                 3904   3904    128   32    1 : tunables    0    0    0 : slabdata    122    122      0
trace_event_file     966    966     88   46    1 : tunables    0    0    0 : slabdata     21     21      0
ftrace_event_field   3485   3485     48   85    1 : tunables    0    0    0 : slabdata     41     41      0
pool_workqueue       500    512    256   16    1 : tunables    0    0    0 : slabdata     32     32      0
radix_tree_node     7032  12712    584   28    4 : tunables    0    0    0 : slabdata    454    454      0
task_group           150    150    640   25    4 : tunables    0    0    0 : slabdata      6      6      0
kmalloc-8192          96     96   8192    4    8 : tunables    0    0    0 : slabdata     24     24      0
kmalloc-4096         843    865   4096    8    8 : tunables    0    0    0 : slabdata    109    109      0
kmalloc-2048        1381   1488   2048   16    8 : tunables    0    0    0 : slabdata     93     93      0
kmalloc-1024        3258   3640   1024   16    4 : tunables    0    0    0 : slabdata    229    229      0
kmalloc-512         3651   4240    512   16    2 : tunables    0    0    0 : slabdata    271    271      0
kmalloc-256         2498   2608    256   16    1 : tunables    0    0    0 : slabdata    163    163      0
kmalloc-128       213344 951264    128   32    1 : tunables    0    0    0 : slabdata  29727  29727      0
kmem_cache_node     4245   4384    128   32    1 : tunables    0    0    0 : slabdata    137    137      0
kmem_cache          3899   3990    384   21    2 : tunables    0    0    0 : slabdata    190    190      0

latest updates:

  • unplugging the cameras from USB with and without restarting the process and with and without restarting the container does not free the memory
  • attempting modprobe -r uvcvideo within the container does not work
  • attempting modprobe -r uvcvideo within the HostOS does not free the memory

Next attempts:

  • 32bit balenaOS
  • 32bit Raspberry OS