Clustering, in the context of databases, refers to the ability of several servers or instances to connect to a single database. An instance is the collection of memory and processes that interacts with a database, which is the set of physical files that actually store data.
Clustering offers two major advantages, especially in high-volume database environments:
- Fault tolerance: Because users can connect to more than one server or instance, clustering provides an alternative if an individual server fails.
- Load balancing: The clustering feature is usually set up to allow users to be automatically allocated to the server with the least load.
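The two advantages above can be sketched together in a few lines. This is only an illustration, not how any particular DBMS implements clustering: the `Instance` class, the node names, and the use of active session count as the measure of load are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for one server/instance in the cluster."""
    name: str
    active_sessions: int = 0
    healthy: bool = True


def pick_instance(instances):
    """Return the healthy instance with the fewest active sessions."""
    candidates = [inst for inst in instances if inst.healthy]
    if not candidates:
        raise RuntimeError("no healthy instance available")
    return min(candidates, key=lambda inst: inst.active_sessions)


def connect(instances):
    """Allocate one new session and return the name of the chosen instance."""
    chosen = pick_instance(instances)
    chosen.active_sessions += 1
    return chosen.name


# Made-up cluster of three instances sharing one database.
cluster = [Instance("node_a", 5), Instance("node_b", 2), Instance("node_c", 3)]

first = connect(cluster)      # load balancing: the least-loaded instance is chosen
cluster[1].healthy = False    # simulate the failure of node_b
second = connect(cluster)     # fault tolerance: a surviving instance is chosen
```

Real clusters typically measure load with richer metrics than a session count, so the selection rule here should be read as a placeholder.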
Partitioning is the database process where very large tables are divided into multiple smaller parts. By splitting a large table into smaller, individual tables, queries that access only a fraction of the data can run faster because there is less data to scan.
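A rough way to see why less data is scanned: the sketch below stores the same rows once as a single list and once split by year, then counts how many rows each lookup has to examine. The table contents and function names are made up for illustration, and a real DBMS decides which partitions to read inside its query planner rather than in application code.

```python
from collections import defaultdict

# Made-up table: 1000 orders spread evenly over the years 2020-2023.
orders = [(order_id, 2020 + order_id % 4) for order_id in range(1000)]

# Partitioned layout: one smaller list of rows per year.
partitions = defaultdict(list)
for order in orders:
    partitions[order[1]].append(order)


def count_full_scan(table, year):
    """Examine every row; return (matching rows, rows examined)."""
    matches = [row for row in table if row[1] == year]
    return len(matches), len(table)


def count_pruned_scan(parts, year):
    """Examine only the partition that holds the requested year."""
    part = parts.get(year, [])
    return len(part), len(part)


full_result = count_full_scan(orders, 2022)
pruned_result = count_pruned_scan(partitions, 2022)
```

Both lookups find the same matching rows, but the partitioned lookup examines only a quarter of the table in this example.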
Partitioning Methods:
- Range Partitioning: Range partitioning maps data to partitions based on ranges of partition key values established for each partition. It is the most common type of partitioning and is often used with dates.
- Hash Partitioning: Hash partitioning maps data to partitions based on a hashing algorithm that the DBMS applies to a partitioning key that you identify. The hashing algorithm distributes rows evenly among partitions, giving the partitions approximately the same size. This makes hash partitioning the ideal method for distributing data evenly across devices. It is also an easy-to-use alternative to range partitioning, especially when the data to be partitioned is not historical.
- List Partitioning: List partitioning lets you explicitly control how rows map to partitions. You do this by specifying a list of discrete values for the partitioning column in the description of each partition. This differs from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where you have no control over the row-to-partition mapping. The advantage of list partitioning is that you can group and organize unordered and unrelated sets of data in a natural way. For example, a list-partitioned table can group states according to their sales regions.
- Composite Partitioning: Composite partitioning combines range partitioning with hash or list partitioning. The DBMS first distributes data into partitions according to the boundaries established by the partition ranges. Then, for range-hash partitioning, the DBMS uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, the DBMS divides the data into subpartitions within each range partition based on the explicit list you choose.
- Index Partitioning: You can choose whether or not an index inherits the partitioning strategy of the underlying table, and you can create both local and global indexes on a table partitioned by range, hash, or composite methods. Local indexes inherit the partitioning attributes of their related tables.
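The row-to-partition mappings described for range, hash, list, and composite partitioning can be sketched as small functions. This is a simplified model under stated assumptions, not a DBMS implementation: the partition names, date boundaries, state groupings, and the use of CRC32 as the hash are all chosen for illustration only.

```python
import bisect
import zlib

# Range: exclusive upper bounds per partition (assumed boundaries).
# ISO date strings sort in chronological order, so string comparison works.
RANGE_BOUNDS = ["2023-01-01", "2024-01-01", "2025-01-01"]
RANGE_NAMES = ["p2022", "p2023", "p2024", "p_future"]

# List: explicit, discrete values per partition (made-up sales regions).
LIST_PARTITIONS = {
    "sales_west": {"CA", "OR", "WA"},
    "sales_east": {"NY", "NJ", "MA"},
}


def range_partition(order_date):
    """Map a date to the first partition whose upper bound exceeds it."""
    return RANGE_NAMES[bisect.bisect_right(RANGE_BOUNDS, order_date)]


def hash_partition(key, num_partitions=4):
    """Map a key to a partition number using a deterministic hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def list_partition(state):
    """Map a state to the partition whose value list contains it."""
    for name, values in LIST_PARTITIONS.items():
        if state in values:
            return name
    raise ValueError(f"no partition lists the value {state!r}")


def range_hash_partition(order_date, customer_id, num_subpartitions=2):
    """Composite: pick the range partition, then a hash subpartition in it."""
    return (
        range_partition(order_date),
        hash_partition(customer_id, num_subpartitions),
    )


in_2023 = range_partition("2023-06-15")
region = list_partition("OR")
composite = range_hash_partition("2022-03-01", customer_id=42)
```

Note the difference in control: the range and list functions are driven entirely by tables the designer writes, while the hash function's placement of any given key is decided by the algorithm. A value that appears in no list raises an error here; how a real DBMS handles such rows depends on the product.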