How to Efficiently Choose the Right Database for Your Applications

Author: Leitao Guo (Database and middleware manager at iQIYI)


Finding the right database solution for your application is not easy. At iQIYI, one of the largest online video sites in the world, we’re experienced in database selection across several fields: Online Transactional Processing (OLTP), Online Analytical Processing (OLAP), Hybrid Transaction/Analytical Processing (HTAP), SQL, and NoSQL.

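Today, I’ll share with you:

  • Database selection criteria
  • What databases we use at iQIYI
  • Decision trees for efficiently choosing a database
  • Tips for choosing a database

I hope this post helps you easily find the right database for your applications.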

Database selection criteria

When choosing a database, different people use different criteria:

  • Database procurement staff pay more attention to purchase costs, including storage and network requirements.

  • Database administrators (DBAs) care about:

    • Operation and maintenance costs:

      • A good monitoring and alerting system
      • Support for backup and restore
      • Reasonable upgrade and migration costs
      • An active support community
      • Ease of performance tuning
      • Ease of troubleshooting
    • Service stability:

      • Support for multiple data replicas
      • Highly available services
      • Support for multiple writes and multi-active architecture
    • Performance:

      • Latency
      • Queries per second (QPS)
      • Whether it supports more advanced hierarchical storage features
    • Scalability: Whether it’s easy to scale horizontally and vertically

    • Security: Whether it meets audit requirements and prevents SQL injections and data leakage

  • Application developers care about:

    • Stable services
    • Performance
    • Scalability
    • Ease of developing database interfaces
    • Ease of modifying the database schema

What databases we use at iQIYI

At iQIYI, we basically use these databases:

  • MySQL
  • TiDB
  • Redis
  • Couchbase
  • Big data analytical systems, like Hive and Impala
  • Other databases, like MongoDB, HiGraph, and HiKV

Because there are so many types of databases at iQIYI, application developers might not know which database is right for their application scenario. Therefore, we categorized these databases by application scenario and database interface, and we built a matrix:

  • The X-axis represents application scenarios: OLTP vs. OLAP.
  • The Y-axis represents database interfaces: SQL vs. NoSQL.

All databases at iQIYI


This matrix has these characteristics:

  • On the left

    • In the upper left corner

      Databases support OLTP workloads and the SQL language. For example, MySQL supports different transaction isolation levels, high QPS, and low latency. We mainly use it to store transaction information and critical data, such as orders and VIP information.

    • In the lower left corner

      We use NoSQL databases to optimize special scenarios. Generally, these databases have simple schemas, or they are schemaless, with high throughput and low latency. We mainly use them as caches or key-value (KV) databases.

  • On the right

    All are OLAP big data analytical systems, such as ClickHouse and Impala. Generally, they support the SQL language and do not support transactions. They scale well, so we can add machines to expand data storage capacity, but their response latency is long.

  • Around the two axes’ meeting point

    These databases are neutral, and we call them HTAP databases, such as TiDB. When the amount of data is small, they have good performance. When the data size is large or the queries are complex, their performance is not bad. Generally, to meet different application needs, we use different storage engines and query engines.
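As a rough illustration, the matrix described above can be written down as a small lookup table. This is only a sketch: the names `Workload`, `Interface`, `MATRIX`, and `candidates` are invented for this example, and placing Redis and Couchbase in the NoSQL/OLTP corner is an assumption based on their use as caches and KV stores, not something the matrix description states explicitly.

```python
from enum import Enum


class Workload(Enum):
    OLTP = "OLTP"
    HTAP = "HTAP"
    OLAP = "OLAP"


class Interface(Enum):
    SQL = "SQL"
    NOSQL = "NoSQL"


# Cells follow the description above where it names a system.
# Redis and Couchbase are placed here by assumption (cache / KV use).
MATRIX = {
    "MySQL": (Workload.OLTP, Interface.SQL),
    "Redis": (Workload.OLTP, Interface.NOSQL),
    "Couchbase": (Workload.OLTP, Interface.NOSQL),
    "TiDB": (Workload.HTAP, Interface.SQL),
    "ClickHouse": (Workload.OLAP, Interface.SQL),
    "Impala": (Workload.OLAP, Interface.SQL),
}


def candidates(workload, interface):
    """Return the databases in one cell of the matrix, sorted by name."""
    return sorted(
        name for name, cell in MATRIX.items() if cell == (workload, interface)
    )
```

Calling `candidates(Workload.OLTP, Interface.SQL)` returns the systems in the upper left corner. An empty list means no database in the table covers that combination.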

Practical decision trees for efficiently choosing a database

I’d like to recommend our database selection trees. We developed these trees based on our DBAs’ and application developers’ experience.

How to efficiently choose a relational database

When you choose a relational database, you can:

  1. Consider your data volume and database scalability.

  2. Decide based on:

    • Whether the database has a cold backup system
    • Whether to use the TokuDB storage engine
    • Whether to use a proxy
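The two steps above could be sketched as a small function. This is a hedged illustration, not the actual iQIYI decision tree: the 1 TB cutoff, the field names, and the mapping of "large or must scale out" to TiDB versus MySQL are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the article does not give a number.
LARGE_VOLUME_TB = 1.0


@dataclass
class RelationalNeeds:
    data_volume_tb: float    # expected data size
    must_scale_out: bool     # horizontal scalability is required
    needs_cold_backup: bool
    wants_tokudb: bool
    wants_proxy: bool


def choose_relational(needs):
    """Return an ordered list of choices: step 1 first, then step 2."""
    # Step 1: data volume and scalability (assumed TiDB vs. MySQL split).
    if needs.must_scale_out or needs.data_volume_tb >= LARGE_VOLUME_TB:
        return ["TiDB"]
    plan = ["MySQL"]
    # Step 2: the three follow-up questions from the list above.
    if needs.needs_cold_backup:
        plan.append("cold backup")
    if needs.wants_tokudb:
        plan.append("TokuDB storage engine")
    if needs.wants_proxy:
        plan.append("proxy")
    return plan
```

Encoding the tree as data-in, list-out keeps each question visible and makes it easy to replace the assumed cutoff with whatever your own team agrees on.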

Efficiently choosing a relational database


How to efficiently choose a NoSQL database

When we choose a NoSQL database, we must consider many factors to decide whether to use the primary-secondary framework, client sharding, distributed cluster, Couchbase, or HiKV.

Efficiently choosing a NoSQL database


Tips for choosing a database

I’d like to share with you some tips for choosing a database:

  • Try to solve the problem without changing the database first. You can determine your requirements based on data volume, QPS, and latency, but are these the real requirements? Can you find a way to eliminate a requirement without involving the database? For example, if the data volume is high, you can encode or compress data first, which may reduce the data size. Do not push all your requirements down to the database level.
  • Consider what the real reason is for choosing a database. Do you choose it because it’s popular? Or because it’s advanced? The most important question is: can it really solve your problem? For example, if your data volume is not very large, you do not need a system with a lot of storage.
  • Think carefully before you give up on a solution. Are you abandoning a system because it doesn’t work? Or because you are not using it well? It’s hard to move away from a solution once it is in place, so be clear about why you are giving up on it. For example, before you make a decision, compare your TPC-C or Sysbench benchmarks.
  • Keep a sensible attitude toward self-development. When you need to develop your own database, you can refer to and use some mature products. Do not build things from scratch if you don’t need to.
  • Embrace open-source products. For example, TiDB is an open-source, distributed SQL database. It has an active community, and currently has 26,000 stars on GitHub. Our previous post described how TiDB helped us scale out our database and achieve high availability. Currently, in the production environment, we have 88 TiDB clusters, with more than 1,200 nodes. There’s no reason why you have to go it alone.
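The first tip mentions encoding or compressing data before it reaches the database. A minimal sketch of that idea, using only Python's standard `json` and `zlib` modules, might look like the following. The helper names `pack` and `unpack` are made up for this example, and how much space you save depends entirely on how repetitive your data is.

```python
import json
import zlib


def pack(record):
    """Serialize a record compactly, then compress it for storage."""
    raw = json.dumps(record, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 6)


def unpack(blob):
    """Reverse pack(): decompress, then parse the JSON back into a dict."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

The application would store the bytes from `pack` in a binary column or as a KV value and call `unpack` on read. The trade-off is that the database can no longer filter or index on fields inside the compressed blob, so this fits data that is only read back whole.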
