Mysql JSON - SELECT, INSERT, UPDATE, 색인

Recommanded Free YOUTUBE Lecture: <% selectedImage[1] %>

yundream
2023-05-04
2020-03-04
68811

## MySQL 테스트 환경
Docker로 실행했다.
```shell
$ docker run -e MYSQL_ROOT_PASSWORD=1234  --name json_mysql mysql
```

Mysql 버전은 (2020년 3월 현재 최신버전)8.0.19다.
```shell
$ mysql -u root -p -h 172.17.0.2
Enter password: 
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 8.0.30 MySQL Community Server - GPL

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 
```

## MySQL JSON Data type
Mysql 5.7 부터 Json 데이터 타입을 지원한다. Json 데이터 타입을 저장하기 위한 칼럼은 **JSON**으로 설정하면 된다. 테스트를 위한 데이터베이스를 만들었다.
```
mysql> create database json_test;
Query OK, 1 row affected (0.01 sec)

mysql> use json_test;
Database changed
```

테스트 테이블을 만들었다.

```sql
CREATE TABLE users ( 
    id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(64) NOT NULL,    
    info JSON
);

```

## INSERT
값을 입력(insert) 해보자.
```sql
INSERT INTO  
    users 
SET
    email = 'yundream@gmail.com',
    info = '{ 
    "name": "yundream",
    "age":45,
    "address1":"seoul",
    "address2":"gan-nam",
    "tel":"010-000-0000"  
}';
```

```sql
INSERT INTO users 
SET email='foo@joinc.co.kr', info=JSON_OBJECT("age", 28, "tel","000-000-0000", "name","foobar", "address1", "busan");
```

## MySQL JSON Path 표현식
MySQL은 JSON 데이터를 처리하는 쿼리를 쉽게 사용하기 위한 **JSON path 표현식(expressions)** 를 제공한다. 아래 쿼리를 보자.
```sql
SELECT * FROM users WHERE info->'$.name' = 'yundream';
```
쿼리에서 사용한 오른쪽 화살표(->)로 필드와 경로식을 구분한다. info 필드에 있는 JSON 데이터에서 key 이름이 **name** 요소를 가리킨다.
```
info->'$.name' 문은 json_extract(info, '$.name') 과 동일하다.
```

```sql
SELECT * FROM users WHERE json_extract(info, '$.name') = 'yundream';
```

## SELECT
SELECT와 JSON Path를 이용해서 JSON 데이터에 접근 할 수 있다.
```sql
SELECT * FROM users;
```
| id   | email                    | info                                                                                                      |
| ---- | ------------------------ | --------------------------------------------------------------------------------------------------------- |
| 1    | yundream@gmail.com       | {"age": 45, "tel": "010-000-0000", "name": "yundream", "address1": "seoul", "address2": "gan-nam"}        |
| 2    | hong.chang@example.com   | {"age": 41, "tel": "010-111-3279", "name": "kim hong chang", "address1": "kwang-ju"}                      |
| 3    | star.jang@example.com    | {"age": 37, "tel": "010-111-3279", "name": "Jang hwa", "address1": "seoul", "address2": "sam-sung"}       |
| 4    | namu.kim@example.com     | {"age": 38, "tel": "010-222-2222", "name": "KIM gil dong", "address1": "seoul", "address2": "sam-sung"}   |
| 5    | hkk.john@example.com     | {"age": 38, "tel": "010-222-2222", "name": "john my", "address1": "seoul", "address2": "seo-cho"}         |

JSON 데이터에 있는 Key 는 **path 연산자(->)** 로 읽을 수 있다.
```sql
SELECT id, email, info->'$.name', info->'$.age' 
FROM users;
```
결과는 아래와 같다.
| id  | email                  | info->'$.name'   | info->'$.age' |
| --- | ---------------------- | ---------------- | ------------- |
| 1   | yundream@gmail.com     | "yundream"       | "45"          |
| 2   | hong.chang@example.com | "kim hong chang" | "41"          |
| 3   | star.jang@example.com  | "Jang hwa"       | "37"          |
| 4   | namu.kim@example.com   | "KIM gil dong"   | "38"          |
| 5   | hkk.john@example.com   | "john my"        | "38"          |

```sql
SELECT * 
FROM users 
WHERE info->"$.age" = 38;
```
| id | email                | info                                                                                                    |
|----|----------------------|---------------------------------------------------------------------------------------------------------|
|  4 | namu.kim@example.com | {"age": 38, "tel": "010-222-2222", "name": "KIM gil dong", "address1": "seoul", "address2": "sam-sung"} |
|  5 | hkk.john@example.com | {"age": 38, "tel": "010-222-2222", "name": "john my", "address1": "seoul", "address2": "seo-cho"}       |

조회는 할 수 있지만 색인이 걸려있지 않기 때문에 효율은 엄청나게 떨어질 것이다. 색인을 걸어야 할 건데, 이건 JSON 색인 편에서 살펴보도록 하자.
## UPDATE
MySQL은 JOSN의 partial UPDATE를 지원한다. UPDATE는 JSON_SET(), JSON_REPLACE(), JSON_REMOVE() 3개의 함수를 이용해서 수행 할 수 있다.
* JSON_SET(): 기존 값을 업데이트하고 존재하지 않을 경우 추가한다.
* JSON_REPLACE(): 기존 값만 바꾼다.
* JSON_REMOVE(): 값을 삭제한다. 이때 값은 NULL이 된다.

문법은 아래와 같다.
```sql
UPDATE table_name SET json_col = JSON_SET(json_col, '$.name', UPPER(json_col->>'$.name')),
             json_col = JSON_REPLACE(json_col, '$.age', int_col),
             json_col = JSON_REMOVE(json_col, '$.address'),
             int_col = int_col + 1;

UPDATE table_name SET json_col = JSON_REPLACE(JSON_REMOVE(json_col, '$.address'),
                                     '$.name', UPPER(json_col->>'$.name'),
                                     '$.age', int_col),
             int_col = int_col + 1;
```

1인 유저의 info.age 를 46으로 업데이트 했다.
```sql
UPDATE users 
SET info = JSON_SET(info, "$.age", 46) 
WHERE id =1;
```

```sql
SELECT * from users where id = 1;
```

| id | email              | info                                                                                               |
|----|--------------------|----------------------------------------------------------------------------------------------------|
|  1 | yundream@gmail.com | {"age": 46, "tel": "010-000-0000", "name": "yundream", "address1": "seoul", "address2": "gan-nam"} |

## JSON 함수들
자주 사용하는 함수들만 설명한다. 전체 함수 목록은 [MySQL 5.7 JSON Function Reference](https://dev.mysql.com/doc/refman/5.7/en/json-function-reference.html)문서를 참고하자.
#### JSON_ARRAY
값을 평가해서, JSON 배열을 반환한다.
```sql
SELECT JSON_ARRAY("apple", "car", "tv", "ios");
```
| JSON_ARRAY("apple", "car", "tv", "ios") |  |
|-----------------------------------------|--|
| ["apple", "car", "tv", "ios"]           |  |

아래와 같은 데이터가 있다고 가정해보자.

```sql
SELECT * 
FROM goods;
```
| name                 | category | price   |
|----------------------|----------|---------|
| ryzen 3700x          | pc       |  542000 |
| ryzen 2700x          | pc       |  250000 |
| MSI AIM-i5           | laptop   |  658000 |
| GRAM15 15ZD90N-VX50K | laptop   | 1410000 |

JSON_ARRAY를 이용해서 JSON 배열로 리턴 할 수 있다.

```sql
SELECT JSON_ARRAY(name, price, category) 
FROM goods;
```
| JSON_ARRAY(name, price, category)           | |
|---------------------------------------------|-|
| ["ryzen 3700x", 542000, "pc"]               | |
| ["ryzen 2700x", 250000, "pc"]               | |
| ["MSI AIM-i5", 658000, "laptop"]            | |
| ["GRAM15 15ZD90N-VX50K", 1410000, "laptop"] | |

#### JSON_EXTRACT
JSON 경로와 일치하는 데이터를 리턴한다.
```sql
SET @j='[10, 20, [30, 40]]';
SELECT JSON_EXTRACT(@j, '$[0]');
```

| JSON_EXTRACT(@j, '$[0]') | |
|--------------------------|-|
| 10                       | |

```sql
SELECT JSON_EXTRACT(@j, '$[1]');
```
| JSON_EXTRACT(@j, '$[1]') |  |
|--------------------------|--|
| 20                       |  |

#### JSON_OBJECT
이 함수는 SELECT 결과를 JSON 형태로 변환해서 받아보고 싶을 때 사용 할 수 있다.

```sql
SELECT JSON_OBJECT("name","yundream", "age", 25, "address", "seoul");
```

| JSON_OBJECT("name","yundream", "age", 25, "address", "seoul") | |
|---------------------------------------------------------------|-|
| {"age": 25, "name": "yundream", "address": "seoul"}           | |

아래와 같이 JSON으로 리턴할 수 있다.
```sql
SELECT JSON_OBJECT('name',name, 'category',category, 'price', price) 
FROM goods;
```

| JSON_OBJECT('name',name, 'category',category, 'price', price)            | |
|--------------------------------------------------------------------------|-|
| {"name": "ryzen 3700x", "price": 542000, "category": "pc"}               | |
| {"name": "ryzen 2700x", "price": 250000, "category": "pc"}               | |
| {"name": "MSI AIM-i5", "price": 658000, "category": "laptop"}            | |
| {"name": "GRAM15 15ZD90N-VX50K", "price": 1410000, "category": "laptop"} | |

아래와 같이 INSERT 문에도 사용 할 수 있다.
```sql
INSERT INTO users 
SET email='foo@joinc.co.kr', info=JSON_OBJECT("age", 28, "tel","000-000-0000", "name","foobar", "address1", "busan");
```

#### JSON_SET

JSON_SET은 JSON 문서에 필드를 추가하거나 업데이트하기 위해서 사용한다. 같은 필드가 있을 경우 업데이트하고, 새로운 필드는 insert 한다.
```sql
SET @j=JSON_OBJECT("a", 1, "b", JSON_ARRAY(2,3));
SELECT  @j;
```
| @j                    |  |
|-----------------------|--|
| {"a": 1, "b": [2, 3]} |  |

```sql
SELECT JSON_SET(@j, '$.a', 10, '$.c', '[true, false]');
```
| JSON_SET(@j, '$.a', 10, '$.c', '[true, false]') | |
|-------------------------------------------------|-|
| {"a": 10, "b": [2, 3], "c": "[true, false]"}    | |

저장된 레코드의 JSON 문서의 데이터를 업데이트 해보자.
```sql
UPDATE users 
SET info=JSON_SET(info, "$.address1", "seoul", "$.address2", "sin-sa") 
WHERE id = 2;
```

```sql
SELECT * 
FROM users 
WHERE id = 2;
```
| id | email                  | info                                                                                                      |
|----|------------------------|-----------------------------------------------------------------------------------------------------------|
|  2 | hong.chang@example.com | {"age": "41", "tel": "010-111-3279", "name": "kim hong chang", "address1": "seoul", "address2": "sin-sa"} |

#### JSON_INSERT
같은 필드에 대한 값을 변경하지 않는다. 새로운 필드만 추가한다.

```sql
SELECT @j;
```
| @j                    | |
|-----------------------|-|
| {"a": 1, "b": [2, 3]} | |

```sql
SELECT JSON_INSERT(@j, '$.a', 10, '$.c', '[treu, false]');
```
| JSON_INSERT(@j, '$.a', 10, '$.c', '[treu, false]') |     |
| -------------------------------------------------- | --- |
| {"a": 1, "b": [2, 3], "c": "[treu, false]"}        |     |

"a" 필드의 값은 업데이트되지 않고, "c"만 추가된걸 확인 할 수 있다.

#### JSON_REPLACE
같은 필드가 존재할 경우 변경한다. 존재하지 않는 필드는 무시한다.

```sql
SELECT @j;
```

| @j                    | |
|-----------------------|-|
| {"a": 1, "b": [2, 3]} | |

```sql
SELECT JSON_REPLACE(@j, '$.a', 10, '$.c', '[treu, false]');
```

| JSON_REPLACE(@j, '$.a', 10, '$.c', '[treu, false]') | |
|-----------------------------------------------------|-|
| {"a": 10, "b": [2, 3]}                              | |

#### JSON_VALID
JSON 문서 형식이 올바른지 검사한다.

```sql
SELECT JSON_VALID('{"a": 1}'), JSON_VALID('hello');
```

| JSON_VALID('{"a": 1}') | JSON_VALID('hello') |
|------------------------|---------------------|
|                      1 |                   0 |

## JSON 색인
아래 SQL문을 보자.
```sql
select * from users where JSON_EXTRACT(info, "$.age") = 38;
```

잘 작동하지만 색인이 걸려있지 않기 때문에 비효율적으로 작동 할 것이다. **EXPLAIN**을 이용해서 쿼리를 측정해보자.

```sql
EXPLAIN SELECT * FROM users WHERE info->"$.age" = 38;
```
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
|----|-------------|-------|------------|------|---------------|------|---------|------|------|----------|-------------|
|  1 | SIMPLE      | users | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | Using where |

**type**필드는 테이블에서 행을 어떻게 찾았는지를 알려준다. **ALL**은 SQL문을 수행하기 위해서 전체 레코드를 전부 스캔했음을 의미한다. 실제 서비스에서 SQL문을 사용해서는 안될 것이다.

MYSQL 5.7.14 부터는 JSON 컬럼의 Key에 대해서도 색인을 만들 수 있다. 테이블을 다시 만들었다.
```sql
CREATE TABLE users (
    id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(64) NOT NULL,
    info JSON,                                          
    age_virtual INT GENERATED ALWAYS AS (info->'$.age')          
);                                                       
```

CREATE TABLE로 테이블을 만들 때, 아래와 같이 컬럼을 추가 할 수 있다.
```sql
`column_name` datatype GENERATED ALWAYS AS (expression)
```

여기에서 핵심은 **GENERATED ALWAYS**와 **AS**다. GENERATED ALWAYS라는 문구는 선택사항이다. 테이블 열이 generated된 컬럼임을 명시적으로 나타내려는 경우에만 필요하다. 필요한 것은 생성된 컬럼에 원하는 값을 리턴하는 **AS**문이다.

```sql
`age_virtual` INT GENERATED ALWAYS AS (info->'$.age') 
```

나는 age_virtual 이라는 INT 데이터를 저장하는 컬럼을 만들었다. 이 컬럼에는 info컬럼의 age에 접근한다는 걸 알려주기 위해서 info->'$.age' 문을 사용했다. 이제 age_virtual 컬럼은 info->'$.age'를 가리키게 된다. 테이블 구성을 살펴보자.

```sql
DESC users;
```

| Field       | Type        | Null | Key | Default | Extra             |
|-------------|-------------|------|-----|---------|-------------------|
| id          | int(11)     | NO   | PRI | NULL    | auto_increment    |
| email       | varchar(64) | NO   |     | NULL    |                   |
| info        | json        | YES  |     | NULL    |                   |
| age_virtual | int(11)     | YES  |     | NULL    | VIRTUAL GENERATED |
이제 **CREATE INDEX**문을 이용해서 age_virtual 컬럼을 색인해보자.

```sql
CREATE INDEX age_idx ON users(age_virtual);
```

EXPLAN 문을 이용해서 쿼리를 평가해보자.
```sql
EXPLAIN select * 
FROM users 
WHERE info->"$.age" = 38;
```
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
|----|-------------|-------|------------|------|---------------|---------|---------|-------|------|----------|-------|
|  1 | SIMPLE      | users | NULL       | ref  | age_idx       | age_idx | 5       | const |    2 |   100.00 | NULL  |

모든 row를 스캔하는 대신에 age_idx 인덱스를 사용해서 2개의 행에 대해서만 쿼리가 수행 된 것을 확인 할 수 있다. 데이터베이스가 운영 중이라면 **ALTER**를 이용해서 색인을 추가 할 수 있다. info.name을 색인해 보자.

```sql
ALTER TABLE users ADD COLUMN name_virtual VARCHAR(64) GENERATED ALWAYS AS (info->'$.name') NOT NULL;
```
인덱스를 만들어보자.
```sql
CREATE INDEX `name_idx` ON users(name_virtual);
```

색인이 적용되는지 쿼리를 평가해보자.
```sql
EXPLAIN select * from users where name_virtual = 'yundream';
```

| id | select_type | table | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra |
|----|-------------|-------|------------|------|---------------|----------|---------|-------|------|----------|-------|
|  1 | SIMPLE      | users | NULL       | ref  | name_idx      | name_idx | 258     | const |    1 |   100.00 | NULL  |

#### JSON_STORAGE_SIZE & JSON_STORAGE_FREE
**JSON_STORAGE_SIZE**은 JSON 문서를 저장하는데 사용된 바이트 수를 반환한다.
```sql
SELECT JSON_STORAGE_SIZE(info) AS Size 
FROM users;
```

| Size |   |
|------| - |
|  102 | |
|   89 | |
|  104 | |
|  108 | |
|  102 | |

보통은 sum, avg 등의 함수와 같이 사용한다.
```sql
SELECT sum(JSON_STORAGE_SIZE(info)) AS TotalSize, avg(JSON_STORAGE_SIZE(info)) AS AVGSize 
FROM users;
```

| TotalSize | AVGSize  |
|-----------|----------|
|       505 | 101.0000 |

## 정리
- SQL을 사용 할 수 있다는 것은 엄청난 장점이다.
- PostGresql도 JSON 데이터 타입을 지원한다. 유연성을 확보한 건데, 이렇게 되면 "수평적 확장"이 핵심 기술이 아닌 한 MySQL 써도 될 거 같다.
- MySQL은 운영/관리의 장점도 있다.

## 참고
- [MySQL :: MYSQL 5.7 Reference Manual](https://dev.mysql.com/doc/refman/5.7/en/json-function-reference.html)
- [MySQL for JSON: Generated Columns and Indexing](https://www.compose.com/articles/mysql-for-json-generated-columns-and-indexing/)

Search For:

BY TAGS

Recent Posts

Archive Posts

Tags

About

Get in Touch

Categories