Wisdom of merging 100s of Oracle instances into one instance

| | August 5, 2015

Our application runs on the web, is mostly an inquiry tool, does some transactions. We host the Oracle database. The app has always had a different instance of Oracle for each customer. A customer is a company which pays us to provide our service to the company’s employees, typically 10,000-25,000 employees per customer. We intend to have several hundred customers. We do a major release every few years, and migrating to that new release is challenging: we might have a team at the customer site for a couple weeks, explaining new functionality and setting up the driving data to suit that customer.

We’re considering going multi-client, putting all our customers into a single shared Oracle 11g instance on a big honkin’ Windows Server 2008 server — in order to reduce costs. I’m wondering if that’s advisable.

There are some advantages to having separate instances for each customer. Tell me if these are bogus, please. In my rough guess about decreasing importance:

  • Our customers MyCorp and YourCo can be migrated separately when breaking changes are made to the schema. (With multi-client, we’d be migrating 300+ customers overnight!?!)

  • MyCorp’s data can be easily backed up and (!!!) restored, without affecting other customers.

  • MyCorp’s data is securely separated from their competitor YourCo’s data, without depending on developers to get the code right and/or DBAs getting the configuration right.

  • Multiple instances are lower risk, because a disaster with one customer (someone accidentally doubles everyone’s salary and the error is discovered after pay day) doesn’t affect other customers. A disaster that affected ALL our customers (whoops, new DBA, and suddenly every participant has the same SSN!?!) might put our company under.

  • Having one instance on one server presents a single point of failure, with our entire customer base out of business if a hurricane knocks the building over. Multiple instances on multiple servers permits geographic dispersion: no catastrophe will affect too large a proportion of our customers, and the unaffected servers in other regions can take on the load of the failed servers.

  • Performance is better because the database is smaller (10,000 vs 2,000,000 rows in ~50 tables).

  • If MyCorp’s offices are (mostly) in just one region, then the MyCorp’s instance can be geographically co-located there, so network lag doesn’t hurt performance. We can provide better service to global clients, for the same reason.

  • In MyCorp wants to take their database in-house, then we can easily export their instance, to get MyCorp their data.

  • Load-balancing is easier because instances can be placed on different servers (this is with a web farm).

  • When a DEV or QA instance is needed, it’s easier to clone the real instance and anonymize the data, because there’s much less data.

  • Because they’re small enough, developers can have their own instance running locally, so they can work on code while waiting at the airport and while in-flight, without fighting VPN hassles.

Q1: What are other advantages of separate instances?

We are contemplating changing the database schema and merging all of our customers into one Oracle instance, running on one hefty server.

Here are advantages of the multi-client instance approach, most important first (my WAG). Please snipe if these are bogus:

  • Less work for the DBAs, since they only need to maintain one instance instead of hundreds. Less DBA work translates to cheaper, our main motive for this change.

  • With just one instance, the DBAs can do a better job of optimizing performance. They’ll have time to add appropriate indexes and review our SQL.

  • It will be easier for developers to debug & enhance the application, because there is only one schema and one app (there might be dozens of schema versions if there are hundreds of instances, with a different version of the app for each version of the schema). This reduces costs too. The alternative is having to start every debug session with (1) What version is this customer running and (2) Let’s struggle to recreate the corresponding development environment, code and database. (We need a Virtual Machine that includes the code AND database instance for each patch and release!)

  • Licensing Oracle is cheaper because it’s priced per server irrespective of heft (or something — I don’t know anything about the subject).

  • The database becomes a viable persistent store for web session data, because there is just one instance.

  • Some database operations are easier with one multi-client instance, like finding a participant when they’re hazy about which customer they (or their spouse, maybe) works for: all the names are in one table. Reporting across customers is straightforward.

Q2: What are other advantages of having multiple clients in one instance?

Q3: Which approach do you think is better (why)? Instance per customer, or all customers in one instance?

I’m concerned that having one multi-client instance makes migration near-impossible, and that’s a deal killer…

… unless there is a compromise solution like having two multi-client instances, the old and the new. In that case case, we would design cross-instance solutions for finding participants, reporting, etc. so customers could go from one multi-client instance to the next without anything breaking.

6 Responses to “Wisdom of merging 100s of Oracle instances into one instance”

  1. It may be worth researching salesforce, and the buzz word you’re looking for is “multi tenant architecture”

    This makes a good read:

    http://blog.dayspring-tech.com/2009/02/forcecom-multitenant-architecture-under-the-covers/

    It’s a good example because Salesforce do use an Oracle db under the covers.

  2. Unless you are using Oracle XE (the limited, free edition) having one database per server will get very expensive very quickly, even if you’re buying single core, single CPU boxes. Having several databases per server is inefficient, because each database incurs an overhead of CPU and RAM usage. Tuning is more difficult, because contention is harder to diagnose.

    So, as well as being easier to administer, a single big server ought to work out cheaper than lots of discrete little servers (no guarantees, no money back!). Make sure you buy the biggest, fastest chips you can and as much RAM as you have free slots. Those are things which give you better performance without affecting your licensing costs.

    Consider the Partitioning option, if you can afford it. This will address your concerns regarding backup and recovery, because each partition can have its own tablespace. So (given partitioning by client_id) it becomes possible to backup or restore an individual client’s data without affecting the other clients. We can even export and import individual partitions. I’m surprised by David’s observation that Partition pruning didn’t work with VPD. But I haven’t tried this combo, so I’ll take his word for it.

    The one thing you might lose from consolidation is the ability to support different clients on different versions of your application. However, this is not necessarily a bad thing. As you observe, maintaining several hundred customers will be a lot easier if you forgo individualised versions of the application. If you do need to offer some bespoke features – even if you just want to beta test some functionality with an individual client – then have a look at Edition-Based Redefinition in 11gR2: it is a really nifty feature. Also it is available for all Oracle licenses, not just Enterprise.

  3. Oracle is made to handle that kind of load.

    My Question – What do you do when you have thousand customers and say ten thousand?
    Do you still keep separate instances/schema?

    I doubt anyone will do that. I have worked earlier in a place where each client had separate database as well as a copy at a central place.

    Change management becomes a headache, you’d have to maintain a very good information about which client/company is on which database revision, schema, app version and all those things. This’d become a software in itself.

    I’d suggest to create software/design based around SaaS model, that’ll allow you easy maintenance and same database/schema for all users.

    For Reliability you can still use clustering – Oracle RAC.

  4. I’ve had to consider the same decision a few times. In our case we use MySQL, so there is no cost associated with running all customers in a separate database.

    The benefits to running all of our customers on a separate database have been great. We have a script that lets us move a customer’s entire instance to any server to balance load. The script merely copies over the database, copies over any custom files, spins up the application, and sets up our routing system to send users to the new instance. The whole process takes just a few minutes.

    Database changes can take a very long time on large mysql databases. Since all our clients have their own database we are able to keep all of our datasets small. Backups are also very fast.

    Our development instances behave the same way, so this method allows us to run a variety of database schemas simultaneously as we develop and test new features. We often work with customers to have them try out a new feature before we deploy it to the rest of our instances. The one rule that we stick to (in order to avoid a few of the drawbacks you mention), is that all clients must be within one version of each other. Maintaining more than a couple versions across clients would have a huge overhead.

    Facebook took the same approach when they started their company. Each school that they launched at had a separate database and they were able to set up new instances very quickly. The primary reason they finally consolidated their database was that they wanted to enable users to communicate between schools.

    If not for potential cost issues I would definitely encourage you to stick with the separate database approach.

  5. Good question, glad to see you are considering all the alternatives. Lots of good points but I will stick to just addressing one.

    I was the DBA for a hosted application and the developers decided to use Oracle Virtual Private Database feature for this.

    The application was constructed with intention of customers sharing a pool of app servers for load balancing and a single database schema on the back end.

    Before VPD we had a Java class that tacked “where customer_id=?” or “and customer_id=?” on every query right before it went to the database so the customer would only see their data. To implement this in VPD upon login ot the DB we would have the app set a variable in the app context that would be used by the VPD policies to allow the session to only see their records. So yeah, you have to code it up right and assign VPD policies to tables, and also trust that Oracle holds up their end of the bargain.

    So was it good for us? In theory it was nice to offload the SQL predicate handling to something outside our application but in practice the advantages didn’t outweight the disadvantages.

    • When we had dozens of clients in one database and when we upgraded they all had to get upgraded at the same time. We had lots of tug-of-wars with customers that didn’t want to upgrade for whatever reason or wanted to do their own QA on the new versions.

    • We entertained the Old instance/New instance thing for upgrades but migrating data was risky and associated downtime did not make customers happy. We did roll our own procedure that would step through tables and export data… But certainly not as easy as a quickie Export or Data Pump job.

    • We also had issues with VPD predicate analysis when it came to Partitioning. As with alot of Oracle features they may work OK on their own but once you combine with other features things get unpredictable. For us partitions not related to the current customer_id weren’t getting eliminated because the predicate analysis was coming too late in the processing of the SQL statement. We worked around it by changing from static to dynamic VPD policies but our time spent parsing shot up.

    So after all that what is my take on it? I would have spent the time making sure our app made good use of bind variables and continued with the old mechanism that added customer_id to the SQL statement.

  6. When you say ‘separate instances’, are you talking about one instance with multiple schemas on it? Or do you really mean multiple instances running on a single machine? There is little reason to run multiple instances on a single machine, as opposed to running multiple schemas on a single instance – each schema would still have their own set of tables, indexes, etc.

    Anyways, I don’t have a full answer, but one thing to keep in mind is the licensing costs of Oracle, and how that can affect what the optimal solution is.

    According to the Oracle store,

    • Oracle standard edition one is $5,800.00 / Processor (where on x86, a processor is a socket, and you can go to up 2 sockets)
    • Oracle standard edition is $17,500.00 / Processor (where on x86, a processor is a socket, and you can go to up 4 sockets)
    • Oracle enterprise edition is $47,500.00 / Processor (where on x86, a processor is 2 CORES – so you have to effectively double that price for quad core CPUs)

    So if, for example, you need 8 quad core CPUs to handle 100 customers, licensing that on a single database is VASTLY more expensive than having 4 separate databases, each having 2 quad core CPUs, each running 25 customers.

    8 quad core CPUs requires enterprise edition, and would have a list price of 16 x $47,500 = $760,000. 4 machines, each running standard edition one, and each with 2 quad-core CPUS, would have a list price of 8 x $5,800.00 = $46,400 – a factor of 16 difference. Now, keep in mind that no one pays list price for enterprise edition, but there is still a huge difference to consider.

    If you don’t have a huge need for database operations across clients, and you don’t need enterprise edition features, and you need this level of CPU power (or expect to grow to need this level of CPU power), the licensing costs are going to be a huge downside of the one-instance approach.

Leave a Reply