Check the actual ulimits for all the running Oracle instances

I’ve read the recent good post from my friend Rene on Pythian’s Blog about how to troubleshoot when user ulimits are different from what specified in limits.conf:

Quick Tip : Oracle User Ulimit Doesn’t Reflect Value on /etc/security/limits.conf

I would like to add my 2 cents:

Once you fix the problem, you may want to check (any maybe monitor) when an instance is running with a wrong value (and maybe encounter the famous Error message: Linux-x86_64 Error: 23: Too many open files in system).

This single line gives you an overview of all your instances at once:

If you find any wrong values, plan a restart before you encounter any error during peak hours!

Ludo

 

 

Smart Bash Prompt for Oracle

If you are an Oracle customer who has several database versions running, you have to deal with scripts that become more and more complex to maintain. Depending on the version or the edition of your database, you may want to run different pieces of code. This forces you to get programmatically more information about your database version and edition (e.g., in order to run a statspack or AWR report if your software is either Enterprise or Standard).
The most common way to get information about the software is connecting to the database and getting it through a couple of selects. But what if you don’t have any running databases?
The ORACLE_HOME inventory has such information, and you can get it with a short shell function:

The snippet searches for a patchset entry in comps.xml to get the patch version rather than the base version (for releases prior to 11gR2 where out-of-place patching occurs). If a patchset cannot be found, it looks for the base version. Depending on the major release, the information about the edition is either in globalvariables.xml (11g, 12c) or in context.xml (10g).
When you call this “ohversion” function, you get both the Oracle version and the edition of your current ORACLE_HOME.
If you’re using the bash as user shell, you may want to take one step forward and  include this information in a much fancier bash prompt than the prompt by default:

2015_06_05_16_22_56_sso0419iAlthough this prompt may seem long, it has several advantages that save you a lot of typing:
• The newline character inside the prompt let’s you start typing commands on an almost empty line so you don’t have to worry about how long your command is.
• The full username@host:path can be copied and pasted quickly for scp commands.
• The time inside the square brackets is helpful to track timings.
• The indication of the current environment (version, edition, SID) lets you know which environment you’re working on.
• The leading number is the exit code of the last command ($?). It’s green when the exit code is zero and red for all other exit codes.
• Hash characters before and after the prompt mitigate the risk of copying and pasting the wrong line by mistake inside your session.

Note: this post originally appeared on IOUG Tips & Best Practices Booklet 9th edition.

Oracle Instances and real memory consumption on Linux and Solaris

There’s a way to know the REAL memory usage by Oracle Instance, including all connecting processes and using the shell rather than a connection to oracle?

The short answer is “I think so” 🙂

Summing up RSS column from ps output, is not reliable because Linux uses a copy-on-write on process forks and also doesn’t take into account correctly the shared memory and other shared allocations.

I’ve come across this post on Pythian’s Blog from Marc Billette.

While it seems good I’ve had discording results depending on platform and release.

Instead, I’ve tried to create a shell snippet that always uses pmap but works differently and SEEMS to work correctly on Linux ans Solaris.

Basically, using the pmap script I get a lot of information about the different memory areas allocated to the process:

 

Initially I’ve tried to decode correctly the different kinds of memory the same way other scripts I’ve found online do:

but finally the ADDRESS is the same from different processes when the memory area is shared, so my script now just get a unique line for each address and sums up the memory size (not the rss one!):

This should give the total virtual memory allocated by the different Oracle instances.

The results I get are plausible both on Linux and Solaris.

Example:

If you find any error let me know and I’ll fix the script!

Ludovico

Oracle Database 12c: Multithreaded Execution (or how make processes decrease)

http://subeteanime.blogspot.ch/ (cc) Too many background processes

Oracle instances on Unix/Linux servers have been composed historically by separated server processes to allow the database to be multi-user, in opposite with Windows that has always been multithread (Oracle 7 on MS-DOS was a single-user process, but this is prehistory…). The background processes number has increased to support all the new features of Oracle, up to this new Oracle 12c release. On a simple database installation you’ll be surprised to have this output from a ps command (38 processes):

If you have consolidated many databases without the pluggable database feature, you’ll end up to have several hundreds of processes even without users connected. But Oracle 12c now introduce the possibility to start an instance using multithreading instead of the traditional processes. This could lead to some optimizations due to the shared process memory, and reduced context switches overhead, I presume (need to test it).

 

Enabling the Multithreaded Execution

By default this feature is not enabled, so you have to set it explicitly:

And in parallel, you’ll need to add this line to the listener.ora:

After a restart, the instance will show only a bunch of processes:

The remaining processes

So we have the Process Monitor (pmon), the Process Spawner (psp0), the Virtual Keeper of Time (vktm), the Database Writer (dbw0) and two new multithreaded processes (u004) and (u005). “U” can stand for User or Unified?

 

Where can I find the information on the other processes?

They still exist in the v$process view, thus leading to some confusion when talking about Oracle Processes with your sysadmins… The new EXECUTION_TYPE column show if the Oracle Process is executed as a thread or as an OS process, and the SPID let us know which process actually executes it.

 

What about the User processes?

Well, I’ve spawned 200 user processes with sqlplus, and got 200 threads:

On the OS side, I’ve registered an additional process to distribute the load of the new user processes. Damn, I start to being confusional using the term “process” o_O

On the session side however, all the user processes are DEDICATED.

 

 A huge side effect

By using the multithreaded execution, the operating system authentication doesn’t work.

Unless Oracle will review it’s authentication mechanism in a future patchset, you’ll need to rely on the password file and use the password to connect to the instance as sysdba, even locally.

What about performance?

In theory, threads should be faster and with a lower footprint:

The main benefit of threads (as compared to multiple processes) is that the context switches are much cheaper than those required to change current processes. Sun reports that a fork() takes 30 times as long as an unbound thread creation and 5 times as long as a boundthread creation.

http://www.princeton.edu/~unix/Solaris/troubleshoot/process.html

and

In some operating systems running on some hardware, switching between threads belonging to the same process is much faster than switching to a thread from different process (because it requires more complicated process context switch).
http://en.wikipedia.org/wiki/Thread_switching_latency

In practice, I’ll do some tests and let you know! 🙂

 

What about the good old OS kill command to terminate processes?

Good question! Currently I have not found any references to an orakill command (that exists on Windows). Hope it will arrive soon!

Cheers

Ludo

How to collect Oracle Application Server performance data with DMS and RRDtool

RRDize everything, chapter 1

If you are managing some Application Server deployments you should have wondered how to check and collect performance data.
As stated in documentation, you can gather performance metrics with the dmstool utility.
AFAIK, this can be done from 9.0.2 release upwards, but i’m concerned DMS will not work on Weblogic.

Mainly, you should have an external server that acts as collector (it could be a server in the Oracle AS farm as well): copy the dms.jar library from an Oracle AS installation to your collector and use it as you would use dmstool:

There are three basilar methods to get data:

Get all metrics at once:

Get only the interesting metrics:

Get metrics included into specific DMS tables:

What youraddress:// is, it depends on the component you are trying to connect:

If you are trying to connect to the OHS (Apache), be careful to allow remote access from the collector by editing the dms.conf file.

Now that you can query dms data, you should store it somewhere.
Personally, I did a first attempt with dmstool -dump format=xml. I wrote a parser in PHP with SimpleXML extension and I did a lot of inserts into a MySQL database. After a few months the whole data collected from tens of servers was too much to be mantained…
To avoid the maintenance of a DWH-grade database I investigated and found RRDTool. Now I’m asking how could I live without it!

I then wrote a parser in awk that parse the output of the dms.jar call and invoke an rrdtool update command.
I always use dms.jar -table command. The output has always the same format:

So I written an awk file that works for me.
use it this way:

And this is the code for update_metric_rrd:

Once you have all your rrd files populated, it’s easy to script automatic reporting. You would probably want a graph with the request count served by your Apache cluster, along with its linear regression:

This is the result:
OHS request completed
OHHHHHHHHHHHH!!!! COOL!!!!

That’s all for DMS capacity planning. Stay tuned, more about rrdtool is coming!

Awk snippet to count TCP sockets grouped by state

Depending on the release of awk it could be:

I saved the script as netstat_c.
I have to filter my netstat output to match only my tcp sockets prior to pipe the output to the script.

On linux:

This is great to check my webserver connections when I do stress tests.

Quick Oracle Dataguard check script

Oracle Dataguard has his own command-line dgmgrl to check the whole dataguard configuration status.
At least you should check that the show configuration command returns SUCCESS.

This is an hypothetic script:

Another script should check for the gap between production online log and the log stream received by the standby database. This can be accomplished with v$managed_standby view.
The Total Block Gap between production and standby can be calculated this way:
Sum all blocks from v$archived_logs where seq# between Current Standby Seq# and Current Production Seq#. Then add current block# of the production LGWR process and subtract current block# from RFS standby process. This gives you total blocks even if there is a log sequence gap between sites.
This is NOT the gap of online log APPLIED to the standby database. THIS IS THE GAP OF ONLINE LOG TRANSMITTED TO THE STANDBY RFS PROCESS and can be used to monitor your dataguard transmission from production to disaster recovery environment.

This is an excerpt of such script (please take care that it does not check against RFS failures, so it can fails when RFS is not alive):

Any comment is appreciated!

Tips: Bash Prompt and Oracle

You may want to check the NEW VERSION of this prompt here.

I disagree with default bash prompt. Do you? It’s quote common to work with long paths:

and, when working on multi-database environments I need to check my environment:

I currently use this prompt, instead:

What is ohvers?? I defined this function to get the version of oracle from my ORACLE_HOME variable:

Pros:

  • I have a blank line that separate my prompt from previous output
  • I get the system clock (useful when saving my konsole history. Did I say konsole?)
  • I can see my Oracle Environment before launching dangerous commands
  • I have an empty line to start my endless commands
  • I have a lot of sharps “#” : they are fine against wrong copy&paste operations…

Suggestions?

Oracle RAC Standard Edition to achieve low cost and high performance

I finished today to create a new production environment based on 2 Linux serverX86_64 and running Oracle RAC 10gR2. (I know, there is 11g right now, but I’m a conservative!)
Wheeew, I just spent a couple of hours applying all the recommended patches!
We choosed 2 nodes with a maximum of 2 multi-core processors each one so we can license Standard Edition instead of Enterprise Edition. 64bits addressing allow us to allocate many gigabytes of SGA. I’m starting with 5Gb but I think we’ll need more. And a set of 6x300Gb 15krpms disks (it can be expanded with more disks and more shelves).
This configuration keeps low the total cost of ownership but achieves best performance.
Due to disks layout, costs and needed usable storage, we had to configure one huge RAID5 on the SAN with multi-path. I decided anyway to create 2 ASM disk groups (ASM is mandatory for Standard Edition RAC), one for the DB, the second one for the recovery area. With spare disks we should have enough availability and even if it’s a RAID5 I saw good write performances (>150M/s).

Welcome new RAC, I hope we’ll feel good together!

It’s time to trouble…

Sometimes it’s hard to find enough time to write something or even to only THINK about writing something

The following are the projects I have to complete before the deadline of December 17th (at least if I still want to go on vacation…)

  • A totally new Oracle 10gR2 RAC SE on Linux (OCFS2, ASM) including jboss frontends, backups, monitoring, documentation. (Servers are ready today).
  • A Disaster recovery architecture based on Dataguard with scripts based on rsync to do filesystem replication, with failover and failback, including backups, monitoring, documentation. (The server in DR site is reachable via network today).
  • A 17 server infrastructure (among others a RAC 10gR2 on linux) transfer from Milan datacenter to here. It’s planned for december 11th but I have to crosscheck backup and contingency requirements.
  • A 14 server infrastructure (based on Windows and SqlServer) transfer from Milan datacenter to here. To be planned in december.
  • A totally new cold failover cluster based on linux with Oracle DBMS and E-business suite (Servers will be provided soon, I hope!).
  • A new standalone Windows Server 64bit to outstand the 32bit allocation bottleneck for a 500Gb oracle database (Server will be provided not before december 10th).
  • Normally manage the day-by-day work, including replying to e-mails and answering the phone.

AARGH!!