My team is trying to resolve an oracle issue we're seeing sporadically that is really causing us trouble.
We have an update process that imports .dmp files via a package that utilizes an Apply job to do so. (not our choice to implement it this way…)
The problem is, we're getting this apply job aborted with an error of 4031 unable to allocate <100 bytes (various values, usually from 60 to 90) from memory, with the shared pool marked as the culprit. ASMM is on. The server is provisioned ~500 MB of memory to run on (these are old Server 2003 environments so memory is in high demand). I am concerned about this error – it seems to me that the error should always report multiples of 4k, as that is our granule size, but perhaps I am mis-understanding the process of freeing up memory. I am also concerned that the server is not freeing up packages as needed.
I believe that the fix is to use dbms_shared_pool.keep("PACKAGE NAME") to pin the package used to run the apply job, so that it never has to allocate the memory again, but we need to test this fix before we deploy it, for various organizational reasons. We also plan to allocate more memory to the oracle database. Setting the minimum size is not a solution, as we've seen this issue when the size of the pool is above the minimum, with ASMM on. We've also tried setting the sga maximum size to a higher value, and pinning the package in question, but again, with no way to reproduce the issue, it is a waiting game to see if it ever happens again.
Our main problem is in reliably reproducing the problem – even when disabling ASMM and setting the shared pool as small as possible, we're not able to reliably reproduce the issue. We've gotten the shared pool small enough that the web interface can't even pull image tags out of the database, or query the login information from the appropriate table, yet the apply job still runs reliably.
We do not have an in-house DBA so naturally we're probably missing some obvious steps.
This issue is present sporadically on 2-3 servers in a 8 server environment. These individual servers all see similar use by different groups. I am unclear as to why only some of the servers encounter this problem, but have not been able to identify any smoking gun cause.
I am hoping for input in either reliably avoiding this issue entirely, or in reliably forcing the shared pool to be unable to allocate bytes to this job, so that we can confirm these fixes actually work.
Please bear in mind that I have read most of the 4031 posts on SO at this point – this problem has been under investigation for weeks. It is likely NOT appropriate to mark this question as duplicate and direct me to another question – I've read it already. It did not answer my question.This is a cross post from plain old Stackoverflow
I have examined the trace files for one of the servers but it was less than enlightening – I am unsure of what to look for.
We've tried flushing the shared pool, but again, with no way to reproduce the issue it turns into an "wait and see" fix, which isn't the answer that our funder is really interested in.