Loader Instructions

What is Loader?

Loader is a data import tool that loads data into TiDB.

Download the Binary.

Why did we develop Loader?

Tools like mysqldump can take days to migrate massive amounts of data, so we used the mydumper/myloader suite to export and import data with multiple threads. In the process, we found that mydumper works well, but myloader lacks error retry and savepoint functions, which makes it inconvenient to use. Therefore, we developed Loader, which reads the output data files of mydumper and imports the data to TiDB through the MySQL protocol.

What can Loader do?

  • Import data with multiple threads

  • Support table-level concurrent import and scatter hot spot writes

  • Support concurrent import of a single large table and scatter hot spot writes

  • Support the mydumper data format

  • Support error retry

  • Support savepoint

  • Improve the speed of importing data through system variables

Usage

Note:

  • Do not import the mysql system database from the MySQL instance to the downstream TiDB instance.
  • If mydumper uses the -m parameter, the data is exported without the table structure and Loader cannot import the data.
  • If you use the default checkpoint-schema parameter, run drop database tidb_loader after importing the data of one database and before you begin to import the next database.
  • It is recommended to specify the checkpoint-schema = "tidb_loader" parameter when importing data.
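
The checkpoint note above can be sketched as a small Python helper. This is a hypothetical wrapper, not part of Loader. It assumes that the binary is at ./bin/loader, that the mysql command-line client is installed, that the TiDB user needs no password, and that each dump directory holds one database. The names plan_imports and run_plan are illustrative only.

```python
import subprocess

CHECKPOINT_SCHEMA = "tidb_loader"  # default value of -checkpoint-schema


def plan_imports(dump_dirs, host="127.0.0.1", port=4000, user="root",
                 loader_bin="./bin/loader"):
    """Return the commands that import several mydumper dumps in sequence.

    After each import, the default checkpoint database is dropped so that
    the next import starts with a clean checkpoint.
    """
    conn = ["-h", host, "-P", str(port), "-u", user]
    steps = []
    for dump_dir in dump_dirs:
        steps.append([loader_bin, "-d", dump_dir] + conn)
        # Assumption: the mysql client is on PATH and can reach TiDB.
        steps.append(["mysql"] + conn +
                     ["-e", f"DROP DATABASE IF EXISTS {CHECKPOINT_SCHEMA}"])
    return steps


def run_plan(steps):
    """Run each command in order and stop at the first failure.

    Stopping early leaves the checkpoint database in place, so an
    interrupted import can still resume from its savepoint.
    """
    for argv in steps:
        subprocess.run(argv, check=True)
```

Calling run_plan(plan_imports(["./db1", "./db2"])) would import the two dumps one after the other. Separating the plan from its execution makes it possible to review the commands before anything touches the cluster.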

Parameter description

  -L string: the log level setting, which can be set as debug, info, warn, error, fatal (default: "info")

  -P int: the port of TiDB (default: 4000)

  -V boolean: prints the version and exits

  -c string: config file

  -checkpoint-schema string: the name of the checkpoint database. During execution, Loader constantly updates this database. After recovering from an interruption, Loader reads the progress of the last run from this database (default: "tidb_loader")

  -d string: the directory that stores the data to import (default: "./")

  -h string: the host of TiDB (default: "127.0.0.1")

  -p string: the password of the TiDB account

  -pprof-addr string: the pprof address of Loader, used to tune the performance of Loader (default: ":10084")

  -t int: the number of threads; increase this as the number of TiKV nodes increases (default: 16)

  -u string: the user name of TiDB (default: "root")
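
As a rough illustration of how the parameters above fit together, the following Python sketch assembles a Loader command line from the documented flags and defaults. The function name build_loader_args and the ./bin/loader path are assumptions made for this example. The log level check mirrors the values listed for -L and is not taken from Loader's source code.

```python
LOG_LEVELS = ("debug", "info", "warn", "error", "fatal")  # values listed for -L


def build_loader_args(data_dir="./", host="127.0.0.1", port=4000, user="root",
                      password="", threads=16, log_level="info",
                      checkpoint_schema="tidb_loader",
                      loader_bin="./bin/loader"):
    """Build an argument list for Loader using only the documented flags."""
    if log_level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {log_level!r}")
    args = [
        loader_bin,
        "-L", log_level,
        "-d", data_dir,
        "-h", host,
        "-P", str(port),
        "-u", user,
        "-t", str(threads),
        "-checkpoint-schema", checkpoint_schema,
    ]
    if password:
        # Only pass -p when a password is actually set.
        args += ["-p", password]
    return args
```

The resulting list can be passed to subprocess.run(args, check=True). Passing a list rather than a single string avoids shell quoting problems with passwords or paths.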

Configuration file

Apart from command line parameters, you can also use a configuration file. The format is shown below:

# Loader log level, which can be set as "debug", "info", "warn", "error" and "fatal" (default: "info")
log-level = "info"

# Loader log file
log-file = "loader.log"

# Directory of the dump to import (default: "./")
dir = "./"

# Loader pprof address, used to tune the performance of Loader (default: "127.0.0.1:10084")
pprof-addr = "127.0.0.1:10084"

# The checkpoint data is saved to TiDB, and the schema name is defined here.
checkpoint-schema = "tidb_loader"

# Number of threads restoring concurrently in the worker pool (default: 16). Each worker restores one file at a time.
pool-size = 16

# The target database information
[db]
host = "127.0.0.1"
user = "root"
password = ""
port = 4000

# The sharding synchronising rules support wildcards.
# 1. The asterisk character (*, also called "star") matches zero or more characters,
#    for example, "doc*" matches "doc" and "document" but not "dodo";
#    the asterisk must be at the end of the wildcard word,
#    and there can be only one asterisk in one wildcard word.
# 2. The question mark '?' matches exactly one character.
#    [[route-rules]]
#    pattern-schema = "shard_db_*"
#    pattern-table = "shard_table_*"
#    target-schema = "shard_db"
#    target-table = "shard_table"
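
The wildcard rules described in the comments above can be sketched in Python as follows. This is an illustration of the stated rules only, not Loader's actual matching code. The function name wildcard_match is hypothetical, and rejecting a misplaced asterisk with an exception is an assumption about how invalid patterns might be handled.

```python
def wildcard_match(pattern: str, name: str) -> bool:
    """Match name against a pattern using the documented wildcard rules.

    '*' matches zero or more characters and may appear only once, at the
    end of the pattern. '?' matches exactly one character.
    """
    if "*" in pattern[:-1]:
        raise ValueError("'*' is only allowed once, at the end of the pattern")
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        if len(name) < len(prefix):
            return False
        head = name[:len(prefix)]  # anything after the prefix is covered by '*'
    else:
        prefix = pattern
        if len(name) != len(prefix):
            return False
        head = name
    # Compare character by character; '?' accepts any single character.
    return all(p == "?" or p == c for p, c in zip(prefix, head))
```

For example, wildcard_match("doc*", "document") is true and wildcard_match("doc*", "dodo") is false, in line with the example given in the configuration comments.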

Usage example

Use command line parameters:

./bin/loader -d ./test -h 127.0.0.1 -u root -P 4000

Or use the configuration file “config.toml”:

./bin/loader -c=config.toml

FAQ

The scenario of synchronising data from sharded tables

Loader supports importing data from sharded tables into one table within one database according to the route-rules. Before synchronising, check the following items:

  • Whether the sharding rules can be represented using the route-rules syntax.
  • Whether the sharded tables contain monotonically increasing primary keys, or whether there are conflicts in the unique indexes or the primary keys after the combination.
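
The second check can be sketched in Python. This minimal example assumes that the primary key values of each sharded table have already been fetched into memory, for instance with a query per shard. The function name find_key_conflicts and the sample table names are hypothetical. For large tables, a check done in SQL on the upstream side is likely more practical.

```python
from collections import defaultdict


def find_key_conflicts(keys_by_table):
    """Return key values that occur in more than one sharded table.

    keys_by_table maps a table name to an iterable of its primary key
    values. The result maps each conflicting key to the sorted list of
    tables that contain it.
    """
    owners = defaultdict(set)
    for table, keys in keys_by_table.items():
        for key in keys:
            owners[key].add(table)
    return {key: sorted(tables)
            for key, tables in owners.items() if len(tables) > 1}


conflicts = find_key_conflicts({
    "table_0": [1, 2, 3],
    "table_1": [3, 4, 5],
})
```

An empty result means that the sampled keys would not collide after the tables are combined. Any entry in the result points to rows that would conflict in the merged table.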

To combine tables, enable the route-rules parameter in the configuration file of Loader:

  • To use the table combination function, you must fill in pattern-schema and target-schema.
  • If pattern-table and target-table are NULL, the table name is not combined or converted.
[[route-rules]]
pattern-schema = "example_db"
pattern-table = "table_*"
target-schema = "example_db"
target-table = "table"
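
To make the effect of such a rule concrete, here is a hedged Python sketch of how a route rule could map a source table to its target. It is an illustration, not Loader's implementation. The function name route is hypothetical, "first matching rule wins" and "unmatched tables keep their names" are assumptions, and fnmatch.fnmatchcase is used as a stand-in matcher that is more permissive than the wildcard rules Loader documents.

```python
from fnmatch import fnmatchcase

# The rule from the configuration example, written as a Python dict.
RULES = [
    {
        "pattern-schema": "example_db",
        "pattern-table": "table_*",
        "target-schema": "example_db",
        "target-table": "table",
    },
]


def route(schema, table, rules):
    """Return the (schema, table) that a source table is routed to."""
    for rule in rules:
        if not fnmatchcase(schema, rule["pattern-schema"]):
            continue
        pattern_table = rule.get("pattern-table")
        if pattern_table and not fnmatchcase(table, pattern_table):
            continue
        # Without a target-table, the table name is left unchanged.
        return rule["target-schema"], rule.get("target-table") or table
    # Assumption: tables that match no rule are imported under their own names.
    return schema, table
```

With the rule above, route("example_db", "table_01", RULES) and route("example_db", "table_02", RULES) both give ("example_db", "table"), which is the table combination described in this section.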
"Loader Instructions" was last updated Jul 26 2018: *: add summary metadata to all docs files for SEO (#550) (c1e613d)
© 2018 PingCAP. All Rights Reserved.
