To be compatible with MySQL 5.7 or later and better support the document store, TiDB supports JSON in the latest version. In TiDB, a document is a set of Key-Value pairs, encoded as a JSON object. You can use the JSON datatype in a TiDB table and create indexes for the JSON document fields using generated columns. In this way, you can flexibly deal with the business scenarios with uncertain schema and are no longer limited by the read performance and the lack of support for transactions in traditional document databases.
The support for JSON in TiDB mainly refers to the user interface of MySQL 5.7. For example, you can create a table that includes a JSON field to store complex information:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON
);
When you insert data into a table, you can deal with those data with uncertain schema like this:
INSERT INTO person (name, address_info) VALUES ("John", '{"city": "Beijing"}');
You can insert JSON data into the table by inserting a legal JSON string into the column corresponding to the JSON field. TiDB will then parse the text and save it in a more compact and easy-to-access binary form.
You can also convert other data type into JSON using CAST:
INSERT INTO person (name, address_info) VALUES ("John", CAST('{"city": "Beijing"}' AS JSON));
INSERT INTO person (name, address_info) VALUES ("John", CAST('123' AS JSON));
INSERT INTO person (name, address_info) VALUES ("John", CAST(123 AS JSON));
Now, if you want to query all the users living in Beijing from the table, you can simply use the following SQL statement:
SELECT id, name FROM person WHERE JSON_EXTRACT(address_info, '$.city') = 'Beijing';
TiDB supports the JSON_EXTRACT
function which is exactly the same as in MySQL. The function is to extract the city
field from the address_info
document. The second argument is a “path expression” and is used to specify which field to extract. See the following few examples to help you understand the “path expression”:
SET @person = '{"name":"John","friends":[{"name":"Forest","age":16},{"name":"Zhang San","gender":"male"}]}';
SELECT JSON_EXTRACT(@person, '$.name'); -- gets "John"
SELECT JSON_EXTRACT(@person, '$.friends[0].age'); -- gets 16
SELECT JSON_EXTRACT(@person, '$.friends[1].gender'); -- gets "male"
SELECT JSON_EXTRACT(@person, '$.friends[2].name'); -- gets NULL
In addition to inserting and querying data, TiDB also supports editing JSON. In general, TiDB currently supports the following JSON functions in MySQL 5.7:
You can get the general use of these functions directly from the function name. These functions in TiDB behave the same as in MySQL 5.7. For more information, see the JSON Functions document of MySQL 5.7. If you are a user of MySQL 5.7, you can migrate to TiDB seamlessly.
Currently TiDB does not support all the JSON functions in MySQL 5.7. You can track our progress in adding this functionality in TiDB #7546.
The full table scan is executed when you query a JSON field. When you run the EXPLAIN
statement in TiDB, the results show that it is full table scan. Then, can you index the JSON field?
First, this type of index is wrong:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
KEY (address_info)
);
This is not because of technical impossibility but because the direct comparison of JSON itself is meaningless. Although we can agree on some comparison rules, such as ARRAY
is bigger than all OBJECT
, it is useless. Therefore, as what is done in MySQL 5.7, TiDB prohibits the direct creation of index on JSON field, but you can index the fields in the JSON document in the form of generated column:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL,
KEY (city)
);
In this table, the city
column is a generated column. As the name implies, the column is generated by other columns in the table, and cannot be assigned a value when inserted or updated. For generating a column, you can specify it as VIRTUAL
to prevent it from being explicitly saved in the record, but by other columns when needed. This is particularly useful when the column is wide and you need to save storage space. With this generated column, you can create an index on it, and it looks the same with other regular columns. In query, you can run the following statements:
SELECT name, id FROM person WHERE city = 'Beijing';
In this way, you can create an index.
Note: In the JSON document, if the field in the specified path does not exist, the result of
JSON_EXTRACT
will beNULL
. The value of the generated column with index is alsoNULL
. If this is not what you want to see, you can add aNOT NULL
constraint on the generated column. In this way, when the value of thecity
field isNULL
after you insert data, it can be detected.
The current limitations of JSON and generated column are as follows:
STORED
through ALTER TABLE
.ALTER TABLE
.The above functions and some other JSON functions are under development.
What’s on this page