Mysql – Separate tables or partition one huge table

database-designMySQLpartitioningwindows

I'm building a rather large table that includes a particular column, my_column. There are a small-ish number of possible values for my_column, and each unique value of my_column should have the same number of rows associated with it. Furthermore, it is likely that queries will only be focused on a single, specific value of my_column at a time. I see that one option would be to partition this large table based upon the values of my_column. The other option as I see it would be to create a separate table entirely such that each table associates with exactly one value of my_column. I'd like to know what would lead me to pick one option over the other.

As far as I understand it, one large table that is partitioned seems like the "correct" or "pure" way to do it, as that seems to be a primary use-case for partitions. However, I suspect that for practical reasons the second option is better, and that's what I'm leaning towards. I don't know if it matters, but I'm dealing with MyISAM tables.

Best Answer

I have an suggestion regarding your second option.

If you need to keep the MyISAM tables with each unique my_column value separated from other MyISAM tables, you may want to look into the MERGE (Mrg_MyISAM) Storage Engine as an alternative to table partitioning.

This will allow your multiple MyISAM tables within the same database that have identical table structures and index layout to be mapped together in such a way that single query hits all MyISAM tables.

Suppose you have a table as follows:

CREATE TABLE tb1
(
    my_column INT NOT NULL
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(30),
    PRIMARY KEY (my_column,id),
    KEY name (my_column,name)
) ENGINE=MyISAM;

and you have 3 such tables (tb1, tb2, tb3) where

  • my_column for tb1 = 1
  • my_column for tb2 = 2
  • my_column for tb3 = 3

You can merge them under a single table mapping as follows:

CREATE TABLE tbmerge LIKE tb1;
ALTER TABLE tbmerge ENGINE=Mrg_MyISAM UNION=(tb1,tb2,tb3);

To perform a search through all the tables, just use tbmerge. For example, suppose you want to see every name from tb2 that starts with 'Jack', you run this query:

SELECT name FROM tbmerge WHERE my_column=2 AND name LIKE 'Jack%';

Given the design of the table, you should always specify the value for my_column. In fact, for every index tb1 has, make sure my_column is always the first column. The reason? A query against tbmerge is always a query against tb1, tb2, tb3 (all underlying tables). Otherwise, this query

SELECT name FROM tbmerge WHERE name LIKE 'Jack%';

will experience horrible performance because it will perform table scans against all underlying tables. Please plan carefully the indexes you will be using, following that simple rule (using my_column as the first column of every index)

There is an additional benefit: You can INSERT into the underlying tables at your convenience, thus working with the

I discussed this using MERGE tables in an earlier post (Jan 4, 2012).