BP and Patch 22652097: set optimizer_adaptive_statistics to FALSE explicitly or it might not work!

Update 14.03.2018: After some exchanges with Nigel Bayliss, the behaviour described here has been filed as unpublished bug 27626925: OPTIMIZER ADAPTIVE STATS DEFAULT FALSE NOT HONORED WHEN ENABLED IN OCT OR JAN BP. It will be fixed starting with April’s bundle patch.

 

According to Nigel’s blog post:

The Oracle 12.1.0.2 October 2017 BP and the Adaptive Optimizer

if you installled the patch 22652097 prior to apply the Bundle Patch 171018, the BP apply in the database should recognize that the patch was already in place and keep it activated. This is done through the fix control 26664361.

When fix_control 26664361:0 -> Patch 22652097 is not enabled: the parameter optimizer_adaptive_features (OAF) works

When fix_control 26664361:1 -> Patch 22652097 is enabled; optimizer_adaptive_features is discarded and the two new parameters have the priority: optimizer_adaptive_plans (OAP) and optimizer_adaptive_statistics (OAS).

But at my customer, I had another behavior.

My patching story might be very similar to yours!

When I started upgrading my customer’s database to 12c in early 2015, I experienced very soon the infamous problems with SQL Plan Directives (SPD) and Adaptive Dynamic Sampling (ADS) that I described in my paper: ADAPTIVE FEATURES OR: HOW I LEARNED TO STOP WORRYING AND TROUBLESHOOT THE BOMB .

Early fixes

When I was new to the problem, the quick fix for the problematic applications was to set OAF to FALSE.

Later, I discovered some more details and decided to opt for setting:

In other cases, I disabled the specific directives that were causing problems.

But many databases did not have so many problems, and I left the defaults.

Patch 22652097 on top of BP170718 

At some point, me and my customer decided to apply the fix 22652097, on top of BP170718 that was our current patch level at that time.

The patch installation on a test database was complaining about the optimizer_adaptive_feature set: this parameter was not used anymore. This issue is nicely explained by Flora in her post Patch 22652097 in 12.1 makes optimizer_adaptive_features parameter obsolete.

In order to apply that patch on the remaining databases, we did:

  • alter system reset optimizer_adaptive_features;
  • alter system reset “_optimizer_dsdir_usage_control”;
  • Applied the patch on binaries and datapatch on the databases.

The result at this point was that:

  • optimizer_adaptive_features was not set
  • optimizer_adaptive_plans was set to true
  • optimizer_adaptive_statistics was set to false.

It might seems superflous to say, but it’s not, the SQL Plan Directives were not used anymore: no Adaptice Dynamic Sampling and no performance problems.

Bundle Patch 180116

Three weeks ago, we installled the last Bundle Patch in order to fix some Grid Infrastructure problems, and the BP, as described in Nigel’s note (and Mike Dietrich and many other bloggers :-)) contains the patch 22652097.

According to Nigel’s post, the patch installation should have detected that the patch 22652097 was already there and activate it.

And indeed, after we applied the BP, the fix_control 26664361 was set to 1 (that means that the patch 22652097 is enabled). So we went live with this setup without additional checks.

One week later, we started experiencing performance problems again. I noticed immediately that the Adaptive Dynamic Sampling was very aggressive again, and the SQL Plan Directives used again.

But the fix was there AND ENABLED!

After a few tests, I realized that the SPD is not used anymore if I set optimizer_adaptive_statistics EXPLICITLY to false.

optimizer_adaptive_statistics must be set explicitly, the default does not work

And here’s the proof:

I use once again the great SPD example by Tim Hall (sorry Tim, it’s not the first time that I steal your work 🙂 ) . You can find here:

SQL Plan Directives in Oracle Database 12c Release 1 (12.1)

After applying the BP, I have the default parameter, not set explicitly, and the fix_control enabled:


If I run the test statement (again, find it here https://oracle-base.com/articles/12c/sql-plan-directives-12cr1) the directives are used:


but then I set the parameter explicitly:

and the SPD usage (and consequently, ADS), are gone:

Conclusion

Set the parameter EXPLICITLY when you apply the BP that contains the fix.

And ALWAYS test the behavior!

You can check how many statements use the dynamic sampling by following this short blog post by Dominic Brooks:

Which of my sql statements are using dynamic sampling?

HTH

The story of ACME and its CRM with serious SQL injections problems

Preface/Disclaimer

This story is real, but I had to mask some names and introduce some minor changes so that real people are not easy to recognize and the whole story does not sound offensive to anyone. This post is not technic, so my non-technical English will be fully exposed. Sorry for the many errors 🙂

ACME, The Company

ACME is a big, global company. It has a huge revenue and there are almost no competitors on the market that get close to it in terms of fame and earnings.

Its core business is heavily supported by its CRM system, that holds all the customers, contracts, prospects, suppliers…

FOOBAR CRM, The CRM system

Despite the CRM is not ACME’s core business, the data in there is really, really precious. Without prospects and customer data, the sales cannot close the deals.

The CRM application (let’s call it FOOBAR CRM) runs on a legacy architecture and it is as old as the company itself.

The architecture is the “old good style” web application that was common in the early 2000’s… : browser front-end (OK, you might think that it is not so old, huh?) , PHP application backed by Apache, MySQL database.

As you can see, quite old but not so uncommon.

One of the big concerns, as in every application lifecycle, is to maintain good code quality. At the beginning of the PHP era, when PHP was still popular, there was a lack of good frameworks (I’m not even sure if there are now, I’m sure Zend Framework was a cool MVC framework but it came out many years later). The result is that now the code maintenance of the application is literally a pain in the a**.

The customer is a noob in development, so when it has been founded and needed a CRM system, the management delegated the development to an external company (let’s call it FOOBAR).

FOOBAR, The software house

The company FOOBAR is as old as the ACME company. Respective founders were relatives: they started the business together and now that the founders left, the partnership is working so well that FOOBAR is also one the biggest resellers of ACME products (despite its business is loosely related to ACME’s business). FOOBAR is also at the same time a partner and a customer, and some member of its board are also part of ACME’s board.

What is important here, is that the advices coming from the “common board members” are considered much more important than the advices coming from ACME’s employees, customers and marketing department.

The code maintenability

ACME has started small, with a small “oldish” CRM system. But some years later ACME experienced a huge increase of customers, product portfolio, employees, revenues etc..

In order to cope with the increasing workload of the application, they scaled everything up/out: there are now tens of web servers nicely load balanced, some webcache servers, and they introduced Galera cluster in conjunction with some replicated servers to scale out the database workload.

The global business of ACME also required to open the FOOBAR CRM application to the internet, exposing it to a wide range of potential attacks.

In order to cope with increasing needs, FOOBAR proposed an increasing number of modules, pieces of code, tools to expand the CRM system. To maximize the profits, FOOBAR decided to employ only junior developers, unexperienced and not familiar at all with development of applications using big RDBMS systems and a very scarse sense of secure programming.

That’s not all!

In order to develop new features faster, ACME and FOOBAR have an agreement that let the end users develop their own modules in PHP code and plug them in the application, most of the times directly in production (you may think: that’s completely crazy, this should NEVER happen in a serious company! You know what? I agree 100%).

Uh, I forgot to mention, the employees that use the CRM application and have some development skills are VERY, VERY happy to have the permission to code on their own, because they can develop features or solve bugfixes on their own, depending on their needs.

Result: the code is completely out of control: few or no unit tests, no integration tests at all, poor security, tons of bugs.

The big SQL Injection problem

Among many bugs, the SQL injection is the most common. It started with some malicious users trying to play around with injection techniques, but now the attacks are happening more and more frequently:

  • The attacks come from many hackers (not related to each other)
  • Some hackers try to get money for that, some other just steal data, some other want just to mess up and low down ACME’s reputation…

everytime an attack is successful, ACME looses more and more contracts (and money).

The fix, up to now, was to track the hacker IP address AFTER the attack and add it to the firewall blacklist (not so clever, huh?).

Possible Solutions (according to the security experts)

ACME mandated an external company to do an assessment. The external company proposed a few things:

  • SOLUTION 1: Change completely the CRM software and use something more modern, modular, secure and developed by a company that hires top talents. There are tons of cloud vendors that offer CRM software as a Service, and other big companies with proven on-premises CRM solutions.
  • SOLUTION 2: Keep the current solution, but with a few caveats:
    • All the code accessing the database must be reviewed to avoid injections
    • only the experienced developers should have the right to write new code (possibly employees of the software house, that will be accountable for new vulnerabilities)
  • SOLUTION 3: Install content-sensitive firewalls and IDS that detect SQL Injection patterns and block them before they reach the web server and/or the database layer.

What the CRM users think

User ALPHA (the shadow IT guy): “We cannot afford to implement any of the solutions: we, as users, need the agility to develop new things for ourselves! And what if there is a bug? If I have to wait a fix from the software house, I might loose customers or contracts before the CRM is available again!”

User BRAVO (the skeptical): “SQL Injection is a complex problem, you cannot solve it just by fixing the current bugs and revoke the grants to develop new code to the non-developers”

User CHARLIE (the lawyer): “When I’ve been hired, I’ve been told that I had the right to drink coffee and develop my own modules. I would never work for a company that would not allow me to drink coffee! Drinking coffee and creating vulnerabilities, are both rights!”

User DELTA (the average non-sense): “The problem is not the vulnerable code, but all those motherf****** of hackers that try to inject malicious code. We should cure mental illness of geeks so they do not transform themselves in hackers.”

User ECHO (the hacker specialist): “If we ask stackoverflow to provide the IP addresses of the people that search for SQL injection code examples, we might preventively block their IP addresses on our external firewall!”

User FOXTROT (the false realist): “Hacker attacks happen, and there’s not much we can do against them. If we fix the code and implement security constraints, there will always be hackers trying to find vulnerabilities. You miss the real problem! We must cure this geeks/hackers insanity first!”

User GOLF (the non-sense paragon): “You concentrate on contracts lost because of SQL Injections, but the food in our restaurant sucks, and our sales also lose contracts because they struggle to fight stomach ache”.

User HOTEL (the denier): “I’ve never seen the logs that show the SQL Injections, I am sure it is a complot of the no-code organizations meant to sell us some WYSIWIG products”.

User INDIA (the unheard): “Why can’t we just follow what the Security Experts suggest and see if it fixes the problem?”

What the management thinks

“We send thought and prayers to all our sales, you are not alone and you’ll never be. (… and thanks for the amazing party, FOOBAR, the wine was delicious!)”

What ACME did to solve the problem

Absolutely nothing.

Forecast

More SQL Injections.

 

UPDATE 20.02.2018

Many people asked me who was the ACME customer that had the SQL injection problem. None. It is an analogy to the US mass shootings that happen more and more frequently, the last one at the time of writing: https://en.wikipedia.org/wiki/Stoneman_Douglas_High_School_shooting

This post is intended to show that, if explained as it was an IT problem, the solution would sound so easy that nobody would have any doubts about the steps that must be done.

Unfortunately, it is not the case, and the US is condamned to have more and more mass shootings because nobody wants to fix the problem. 🙁