[BUG] When calling switch_mode_and_capture_request I'm getting an error OSError: [Errno 12] Cannot allocate memory
#1102
Comments
I don't really understand why you're getting CMA allocation failures, because a Pi 5 shouldn't be using CMA at all. Can you paste the output of Is there anything else much running while you are using this camera script? |
hi @davidplowman thanks for your quick response. Here is the result from dma_heap I'm not specifically running anything else, it's a very basic installation and this is the only thing I'm running. |
Well, that's wrong. On a Pi 5, Do you have any idea how that's happened? Have you installed something that might have done this? If you have a spare SD card lying around, maybe try installing a clean Raspberry Pi OS and check that it links to |
hmm. that's interesting, I've got 6 PIs here, and just looked on a few others, and they have Is there a way to modify this to 'system' to see if it fixes this error (while I still have the PI booted up & erroring) |
Well, I think it's only a symbolic link. Presumably you could go in as root and recreate it. But just be sure that there's nothing there you can't recover, just in case it never boots again. So I think that file system does get created when you boot, but I've never heard of it pointing to the wrong thing. It would be interesting if rebooting fixes the link, or not. |
I've been reluctant to reboot, as I've been trying to replicate this error (which happened a few times in production) this is the first time I've seen it locally on any of the PIs I've got here. I have rebooted, and it has gone back to 'system', and the script runs, I've tried running it about 10 times and it seems ok, but as I mentioned before it's intermittent, and will very likely happen again. |
I suppose the interesting thing will be to see if it switches spontaneously to |
I've been trying to replicate this all day, and almost gave up, but then rebooted the PIs and ran the script, it failed on two of the PIs and when checking the Can you think of anything that might be causing this? I've googled I've added more logging into my software to display this |
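For readers who want similar logging, here is a minimal sketch (not the poster's actual code) of reading where the symlink points right after boot. The path `/dev/dma_heap/vidbuf_cached` is an assumption pieced together from the names mentioned in this thread, and `heap_link_target` and `log_heap_link` are hypothetical helper names.

```python
import logging
import os

# Assumed path: the thread mentions "dma_heap" and "vidbuf_cached",
# but the full path is not shown, so adjust this for your system.
HEAP_LINK = "/dev/dma_heap/vidbuf_cached"


def heap_link_target(link_path=HEAP_LINK):
    """Return the name the symlink points at, or None if it cannot be read."""
    try:
        return os.path.basename(os.readlink(link_path))
    except OSError:
        # Covers a missing path as well as a path that is not a symlink.
        return None


def log_heap_link(link_path=HEAP_LINK, expected="system"):
    """Log the symlink target and return True if it matches `expected`."""
    target = heap_link_target(link_path)
    if target == expected:
        logging.info("%s -> %s (as expected)", link_path, target)
    else:
        logging.warning("%s -> %s (expected %s)", link_path, target, expected)
    return target == expected
```

Calling `log_heap_link()` once at startup, before touching the camera, would record whether a given boot ended up in the "bad" state.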
That's so weird. I'll ask around here at Pi Towers. |
It happened again this morning, after rebooting 3 times (I have 6 PIs here). And it has been a different PI each time it has occurred, so I don't think it's anything dodgy with a particular PI. Are there any specific logs you'd be interested to see? or any other device information? And, this is before I've done anything with PICamera2, I've just been SSHing onto the machines to get the vidbuf_cache value. |
Yes, a dmesg log would be great. We'll see if anyone else suggests anything more. |
Here is the
...error, as this is just me observing that the symbolic link is pointing to |
Actually, maybe post a "good" version of dmesg as well, we can look for any differences. |
I've rebooted the same machine, and it was back to I did try this yesterday, but it's quite difficult to compare, because the log messages are in a slightly different order. |
Could you also attach the contents of |
This is the good one, I'll have to wait for it to happen again.. I'll try some re-boots
|
I've rebooted three times, and a different PI was set to This is the cpuinfo for that PI (i.e. when it's in the 'bad' state) I can grant you (or someone in your team) access to connect to the PI if you'd like to investigate further?
|
The destination of the symlink is supposed to be determined at boot by looking at the I wonder if there might be some timing issues, and it depends on exactly when system performs that action. I notice in your dmesg logs that the camera comes up earlier in the "bad" log. I wonder if there's any kind of pattern there. Out of interest, have you ever noticed this with a Raspberry Pi camera? Though with smaller images you might not notice without checking explicitly - because the CMA area will be big enough. |
OK, here are a couple more ideas. First of all can you add |
I wonder if this happens quite frequently on PIs, but rarely gets noticed. I was originally just capturing in either 1/2 or full resolutions, it's only since I've been starting in 1/2, then capturing in full that this issue has become apparent. I'm guessing it takes a bit more memory to switch modes rather than just starting with full and capturing in the same mode. I have a few HQ cameras, but I doubt they'd error, as they take up a lot less memory, so I'm guessing that even if it was set to The memory errors I'm getting are a symptom of the vidbuf_cached issue, rather than the cause (I believe) The tests I've been doing today haven't touched the cameras, although they're all still connected. |
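Since the memory error is a symptom rather than the cause, it can help to attach diagnostics at the point where it surfaces. The sketch below is an illustration only: it assumes `picam2` is an already started Picamera2 object, and `capture_full_res` and its `diagnose` parameter are hypothetical names, not part of Picamera2.

```python
import errno


def capture_full_res(picam2, full_config, diagnose=None):
    """Switch to `full_config`, capture one request, and return it.

    `diagnose` is an optional callable returning a string (for example the
    current symlink target) that is added to the error message on failure.
    """
    try:
        return picam2.switch_mode_and_capture_request(full_config)
    except OSError as exc:
        if exc.errno != errno.ENOMEM:
            raise  # Not the allocation failure discussed in this thread.
        detail = diagnose() if diagnose else "no diagnostics collected"
        raise RuntimeError(
            f"buffer allocation failed while switching mode: {detail}"
        ) from exc
```

The caller remains responsible for calling `release()` on the returned request once it has finished with the image data.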
I'm still not seeing anything in the logs jump out at me, I'm afraid, and I'm also not here next week. But it would be interesting to know if you happen to notice any more patterns, or whether you see it with other cameras (now that you know how to check right after boot if it's gone wrong). In the meantime, I'll ask folks here to keep a lookout for this, and we can perhaps establish how widespread it is. |
I can reproduce the issue and have managed to get some useful output. I think what's happening is that the udev rule runs before the 'system' node shows up, so creating the symlink fails and it falls through to the other case. Not 100% sure, still investigating. Edit: Nope, that doesn't make sense... |
The right thing happens and then the symlink gets replaced.
So I think it's a race condition between whether |
@davidplowman I'm guessing that it is quite widespread, but unless you're running something that will fail due to this symlink, you won't ever know. I've got 6 PIs here, and I'd say that it happens to 1 PI about once every 3 power cycles, so maybe a 1/20 ish chance, which seems in line with what we were seeing in production where we had 40 PIs and 1 or 2 of them would fail. @XECDesign I'm happy that you were able to reproduce, so at least it's not just me... and I've just seen your latest comment whilst typing. Do you think there might be a config change I could make to prevent this? |
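The rough estimate above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes every boot fails independently with the same probability, which the thread does not establish; `per_boot_failure_rate` is a hypothetical helper name.

```python
def per_boot_failure_rate(failures, devices, power_cycles):
    """Observed failures divided by the total number of boots."""
    return failures / (devices * power_cycles)


# About 1 failure across 6 Pis over 3 power cycles, i.e. 1 in 18 boots.
rate = per_boot_failure_rate(failures=1, devices=6, power_cycles=3)

# Scale up to the 40 Pis mentioned for production.
fleet_size = 40
expected_failures = fleet_size * rate
p_at_least_one = 1 - (1 - rate) ** fleet_size

print(f"per-boot failure rate: {rate:.3f}")
print(f"expected failures per fleet boot: {expected_failures:.1f}")
print(f"chance at least one Pi fails: {p_at_least_one:.2f}")
```

Under these assumptions, a fleet of 40 would be expected to show roughly two failures per boot, which is consistent with the "1 or 2" seen in production.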
I don't know what the right fix is right now, but to temporarily work around the issue you can edit
Then when I know the right fix, I can push that out and your change will be overwritten. |
Testing with My pi is set up to keep rebooting until the issue occurs. It has been going for a while, so maybe that's the fix. I think the rule is meant to look like this:
|
My
Or did you just not include this? |
I removed everything else because I don't think it's needed. What I pasted above should be the whole file. |
It's looking promising, I've replaced my rules file with what you've posted above on all 6 PIs, and I've done about 7 power cycles without failure (i.e. it's always set to Thank you so much for this (both of you) As we have 60+ PIs in various places, should I wait for an official release, or do these things take a long time to go through QA/Release etc? If it will take some time, I can write a script which replaces this file via an SSH script, and update them all remotely. |
I would expect the fix to ship some time next week |
Just tested that it works as intended on older pi models as well and shipped the update. Many thanks for spotting and helping track down the issue. It would've caused a lot of other problems which could've been written off as "couldn't reproduce it, you must be doing something wrong". |
Thank you @XECDesign and @davidplowman for taking the time to look into this and resolving it so quickly. Is this now just a matter of doing an |
|
Hello, using an imx519 on RPI4 I see a similar issue. It was working in the past.
I did full-upgrade, but did not change anything. app started:
switching back and forth between modes here. CmaFree is between 300MB and 150MB until it drops further and now it runs out of memory
camera backend restarted to recover:
|
@mgineer85 Hi, and thanks for the report. The problem under discussion in this thread only occurs on Pi 5, so yours must be quite different. Please file a new report for a new problem. Be sure to state: your model of Pi, the camera, and confirm that everything is up to date. If possible, please also include a short script (no more than a dozen lines, if possible) that demonstrates the failure. Thanks! |
I'm starting to give up on using Arducams because of their poor integration. |
There have been a few kernel related changes recently because of the Hailo device and the new AI camera, but hopefully things are going to settle down a bit now, so once Arducam have caught up again maybe things will stay working for a bit longer. (As I'm sure you know, we encourage vendors to upstream support for their cameras, and then this kind of thing would happen less.) |
I have a python script which starts a camera in 1/2 resolution, then when I'm ready, I switch to full resolution and capture. This works for the majority of the time, but occasionally I'll get a memory error.
When this error occurs, it will keep occurring until I reboot (restarting the script is insufficient), however I can happily run my test script multiple times without issue (on the same PI) so something is getting it into a state.
I am unable to replicate consistently, I have to wait for it to error, then do some tests.
I've simplified my code, and condensed it down to a simple python file.
This results in the following output
If I look at dmesg I can see some cma_alloc errors, and at one point a dma error.
I'm running with the following hardware
Raspberry Pi 5 (4gb memory)
OS - Debian GNU/Linux 12 (bookworm)
Uname - 6.6.31+rpt-rpi-2712
Camera - Arducam 64mp (Hawkeye)
Free results
libcamera-still --version results