{"id":1894,"date":"2019-11-22T14:52:05","date_gmt":"2019-11-22T12:52:05","guid":{"rendered":"http:\/\/www.ludovicocaldara.net\/dba\/?p=1894"},"modified":"2020-08-18T15:59:45","modified_gmt":"2020-08-18T13:59:45","slug":"oracle-hugepages-usage-linux","status":"publish","type":"post","link":"https:\/\/www.ludovicocaldara.net\/dba\/oracle-hugepages-usage-linux\/","title":{"rendered":"Checking usage of HugePages by Oracle databases in Linux environments"},"content":{"rendered":"<p>Yesterday several databases on one server started logging errors in the alert log:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true \">ORA-00603: ORACLE server session terminated by fatal error\r\nORA-27504: IPC error creating OSD context\r\nORA-27300: OS system dependent operation:sendmsg failed with status: 105\r\nORA-27301: OS failure message: No buffer space available\r\nORA-27302: failure occurred at: sskgxpsnd2<\/pre>\n<p>That means there was not enough contiguous free memory in the OS. The first thing I checked was, of course, the memory and the huge page usage:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true \"># [ oracle@oraserver1:\/home\/oracle [10:45:46] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ free\r\n              total        used        free      shared  buff\/cache   available\r\nMem:      528076056   398142940     3236764   119855448   126696352     5646964\r\nSwap:      16760828    11615324     5145504\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [10:46:47] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ cat \/proc\/meminfo | grep Huge\r\nHugePages_Total:   180000\r\nHugePages_Free:    86029\r\nHugePages_Rsvd:    11507\r\nHugePages_Surp:        0\r\nHugepagesize:       2048 kB<\/pre>\n<p>The available memory (last column of the <code>free<\/code> output) was indeed quite low, but there was still plenty of space in the huge pages (86k pages free out of 180k).<\/p>\n<p>The usage by Oracle instances:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true \"># [ 
oracle@oraserver1:\/home\/oracle [10:45:39] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ sh mem.sh\r\nDB12 : 54081544\r\nDB22 : 37478820\r\nDB32 : 67970828\r\nDB42 : 14846552\r\nDB52 : 16326380\r\nDB62 : 15122048\r\nDB82 : 56900472\r\nDB92 : 14401080\r\nDBA2 : 12622736\r\nDBB2 : 14379916\r\nDBC2 : 46078336\r\nDBD2 : 46137728\r\nDB72 : 37351336\r\ntotal :  433697776<\/pre>\n<p><a href=\"https:\/\/www.ludovicocaldara.net\/dba\/real-memory-usage-on-linux\/\">You can get the code of mem.sh in this post.<\/a><\/p>\n<p>Regarding pure shared memory usage, the situation was what I was expecting:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true \">$ ipcs -m | awk 'BEGIN{a=0} {a+=$5} END{print a}'\r\n369394520064\r\n<\/pre>\n<p>About 344 GiB of shared memory in use, much more than the roughly 184 GiB (93971 huge pages of 2 MiB) allocated in the huge pages.<\/p>\n<p>I compared the situation with the other node in the cluster: it had more memory allocated by the databases (because of the higher load on it), more huge page usage and lower 4k page consumption overall.<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true\">$ sh mem.sh\r\nDB12 : 78678000\r\nDB22 : 14220000\r\nDB32 : 14287528\r\nDB42 : 12369352\r\nDB52 : 14868596\r\nDB62 : 14633984\r\nDB82 : 54316104\r\nDB92 : 86148332\r\nDBA2 : 61473288\r\nDBB2 : 68678788\r\nDBC2 : 9831288\r\nDBD2 : 64759352\r\nDB72 : 68114604\r\ntotal :  562379216\r\n\r\n$ free\r\n              total        used        free      shared  buff\/cache   available\r\nMem:      528076056   402288800    17100464     5818032   108686792   114351784\r\nSwap:      16760828       47360    16713468\r\n\r\n$ cat \/proc\/meminfo | grep Huge\r\nAnonHugePages:     10240 kB\r\nHugePages_Total:   176654\r\nHugePages_Free:    15557\r\nHugePages_Rsvd:    15557\r\nHugePages_Surp:        0\r\nHugepagesize:       2048 kB\r\n<\/pre>\n<p>So I wondered whether all the DBs were properly allocating the SGA in huge pages.<\/p>\n<p><a href=\"https:\/\/access.redhat.com\/solutions\/320303\">This redhat page has been 
quite useful<\/a> to create a quick snippet to check the huge page memory allocation per process:<\/p>\n<pre class=\"lang:sh decode:true\"># [ oracle@oraserver1:\/home\/oracle [10:55:27] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ cat \/proc\/707\/numa_maps | grep -i hug\r\n60000000 default file=\/SYSV00000000\\040(deleted) huge dirty=1 mapmax=57 N0=1 kernelpagesize_kB=2048\r\n70000000 default file=\/SYSV00000000\\040(deleted) huge dirty=1525 mapmax=57 N0=743 N1=782 kernelpagesize_kB=2048\r\nc60000000 interleave:0-1 file=\/SYSV0b46df00\\040(deleted) huge dirty=1 mapmax=57 N0=1 kernelpagesize_kB=2048\r\n\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [10:56:39] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ function pshugepage () {\r\n&gt; HUGEPAGECOUNT=0\r\n&gt; for num in `grep 'huge.*dirty=' \/proc\/$@\/numa_maps | awk '{print $5}' | sed 's\/dirty=\/\/'` ; do\r\n&gt; HUGEPAGECOUNT=$((HUGEPAGECOUNT+num))\r\n&gt; done\r\n&gt; echo process $@ using $HUGEPAGECOUNT huge pages\r\n&gt; }\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [10:57:09] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ pshugepage 707\r\nprocess 707 using 1527 huge pages\r\n\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [10:57:11] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ for pid in `ps -eaf | grep [p]mon | awk '{print $2}'` ; do pshugepage $pid ; done\r\nprocess 707 using 1527 huge pages\r\nprocess 3685 using 2409 huge pages\r\nprocess 16092 using 3056 huge pages\r\nprocess 55718 using 0 huge pages\r\nprocess 58490 using 0 huge pages\r\nprocess 70583 using 0 huge pages\r\nprocess 94479 using 1135 huge pages\r\nprocess 98216 using 0 huge pages\r\nprocess 98755 using 0 huge pages\r\nprocess 100245 using 0 huge pages\r\nprocess 100265 using 0 huge pages\r\nprocess 100270 using 0 huge pages\r\nprocess 101681 using 0 huge pages\r\nprocess 179079 using 1699 huge pages\r\nprocess 189585 using 14566 huge pages<\/pre>\n<p>It has been easy to spot the databases not using huge pages at all:<\/p>\n<pre class=\"lang:plsql highlight:0 
decode:true \"># [ oracle@oraserver1:\/home\/oracle [10:58:26] [19.3.0.0.0 [GRID] SID=GRID] 0 ] #\r\n$ ps -eaf | grep [p]mon\r\noracle      707      1  0 Sep30 ?        00:23:55 ora_pmon_DB12\r\noracle     3685      1  0 Nov01 ?        00:09:17 ora_pmon_DB22\r\noracle    16092      1  0 Oct15 ?        00:04:15 ora_pmon_DB32\r\noracle    55718      1  0 Aug12 ?        00:08:25 asm_pmon_+ASM2\r\noracle    58490      1  0 Aug12 ?        00:08:24 apx_pmon_+APX2\r\noracle    70583      1  0 Aug12 ?        00:57:55 ora_pmon_DB42\r\noracle    94479      1  0 Oct02 ?        00:32:03 ora_pmon_DB52\r\noracle    98216      1  0 Aug12 ?        00:58:36 ora_pmon_DB62\r\noracle    98755      1  0 Aug12 ?        00:59:27 ora_pmon_DB82\r\noracle   100245      1  0 Aug12 ?        00:56:52 ora_pmon_DB92\r\noracle   100265      1  0 Aug12 ?        00:51:54 ora_pmon_DBA2\r\noracle   100270      1  0 Aug12 ?        00:54:57 ora_pmon_DBB2\r\noracle   101681      1  0 Aug12 ?        00:56:55 ora_pmon_DBC2\r\noracle   179079      1  0 Sep10 ?        00:35:17 ora_pmon_DBD2\r\noracle   189585      1  0 Nov01 ?        
00:09:34 ora_pmon_DB72<\/pre>\n<p>Indeed, after stopping them, the huge page usage did not change:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true\"># [ oracle@oraserver1:\/home\/oracle [11:01:52] [11.2.0.4.0 [DBMS EE] SID=DB62] 1 ] #\r\n$ srvctl stop instance -d DB6_SITE1 -i DB62\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:02:24] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl stop instance -d DB4_SITE1 -i DB42\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:03:29] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl stop instance -d DB8_SITE1 -i DB82\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:06:36] [11.2.0.4.0 [DBMS EE] SID=DB62] 130 ] #\r\n$ srvctl stop instance -d DB9_SITE1 -i DB92\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:07:16] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl stop instance -d DBA_SITE1 -i DBA2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:07:56] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl stop instance -d DBB_SITE1 -i DBB2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:08:42] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl stop instance -d DBC_SITE1 -i DBC2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:09:16] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ cat \/proc\/meminfo | grep Huge\r\nHugePages_Total:   180000\r\nHugePages_Free:    86029\r\nHugePages_Rsvd:    11507\r\nHugePages_Surp:        0\r\nHugepagesize:       2048 kB<\/pre>\n<p>But after starting them again, I could see the new huge pages being reserved\/allocated:<\/p>\n<pre class=\"lang:plsql highlight:0 decode:true\"># [ oracle@oraserver1:\/home\/oracle [11:10:35] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DB6_SITE1 -i DB62\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:12:14] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DB4_SITE1 -i DB42\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:12:54] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DB8_SITE1 -i DB82\r\n\r\n# [ 
oracle@oraserver1:\/home\/oracle [11:13:41] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DB9_SITE1 -i DB92\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:14:43] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DBA_SITE1 -i DBA2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:15:25] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DBB_SITE1 -i DBB2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:15:54] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ srvctl start instance -d DBC_SITE1 -i DBC2\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:17:49] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ cat \/proc\/meminfo | grep Huge\r\nHugePages_Total:   180000\r\nHugePages_Free:    72820\r\nHugePages_Rsvd:    68961\r\nHugePages_Surp:        0\r\nHugepagesize:       2048 kB\r\n\r\n# [ oracle@oraserver1:\/home\/oracle [11:17:54] [11.2.0.4.0 [DBMS EE] SID=DB62] 0 ] #\r\n$ free\r\n              total        used        free      shared  buff\/cache   available\r\nMem:      528076056   392011828   123587116     5371848    12477112   126250868\r\nSwap:      16760828      587308    16173520<\/pre>\n<p>The reason was that the server had been started without huge pages configured, and the huge pages were set up only after a few instances had already started.<\/p>\n<p>HTH<\/p>\n<p>&#8212;<\/p>\n<p>Ludovico<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Yesterday several databases on one server started logging errors in the alert log: ORA-00603: ORACLE server session terminated by fatal error ORA-27504: IPC error creating OSD context ORA-27300: OS system dependent operation:sendmsg failed with status: 105 ORA-27301: OS failure message: &hellip; <a href=\"https:\/\/www.ludovicocaldara.net\/dba\/oracle-hugepages-usage-linux\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[321,5,326,3,330],"tags":[],"class_list":["post-1894","post","type-post","status-publish","format-standard","hentry","category-aced","category-linux","category-oracle","category-oracledb","category-oracle-inst-upg"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/posts\/1894","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/comments?post=1894"}],"version-history":[{"count":1,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/posts\/1894\/revisions"}],"predecessor-version":[{"id":1895,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/posts\/1894\/revisions\/1895"}],"wp:attachment":[{"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/media?parent=1894"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/categories?post=1894"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ludovicocaldara.net\/dba\/wp-json\/wp\/v2\/tags?post=1894"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}