I have created an online service which provides enhanced data generation. If you need more than MySQL data, see Wedgeup Online Data Generator. |
---|
A tool to generate insert statements for a MySQL database. Its main purpose is to enable data generation for a full relational schema in real-time scenarios.
The tool requires Java 21. The code is developed using the Temurin 21 JDK.
The tool is packaged as an executable jar file named data-conjurer-<version>.jar. You can download the latest version from either Maven Central or the GitHub releases page.
To build the executable jar file from source, clone the repository and run the following Maven command:
mvn clean install
This creates a data-conjurer-<version>.jar file under the conjurer-shell/target folder. The current version is 1.0.0.
Run the tool by passing it a schema file and a plan file:
java -jar data-conjurer-<version>.jar schema.yaml plan.yaml
- schema.yaml: defines the data structure
- plan.yaml: defines the data generation plan, such as the number of rows
Refer to the Data Conjurer Reference for details on how to create these two files.
There are many sample files under the examples folder. For instance, the following command creates data using the schema and plan files defined under the examples/helloworld folder:
java -jar data-conjurer-1.0.0.jar examples/helloworld/schema.yaml examples/helloworld/plan.yaml
The output files are named using the format ${applyOrder}_${entityName}.sql. The above command outputs two files:
- 0_country.sql: sql insert statements for country table which should be applied first
- 1_city.sql: sql insert statements for city table which should be applied after
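Since the numeric prefix encodes the apply order, the generated files can be loaded with a small script. The following is a minimal sketch, not part of the tool: the function names `list_in_apply_order` and `apply_all` are hypothetical, and it assumes the `mysql` command line client is installed, that credentials come from your client option file, and that file names contain no whitespace.

```shell
# List generated files in apply order, sorted numerically by the
# prefix before the first underscore (so 10_x.sql comes after 2_y.sql).
# $1 = directory holding the generated files (defaults to current directory)
list_in_apply_order() {
  ( cd "${1:-.}" && ls | grep -E '^[0-9]+_.+\.sql$' | sort -t_ -k1,1n )
}

# Apply every generated file to a database, stopping at the first failure.
# $1 = directory holding the generated files, $2 = target database name
apply_all() {
  for file in $(list_in_apply_order "$1"); do
    echo "Applying $file"
    mysql "$2" < "$1/$file" || return 1
  done
}

# Example (assumes a database named demo already contains the schema):
#   apply_all . demo
```

A plain alphabetical sort would put 10_*.sql before 2_*.sql, which is why the sketch sorts on the numeric prefix instead.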
For each run, the tool also creates an output.log file under the current directory containing the execution logs.
The following options can be used to control the behavior of this tool:
Short Format | Long Format | Description | Default |
---|---|---|---|
c | max-collision | Max occurrence of generated records which violate index constraints | 100 |
e | entity-timeout | Single entity generation timeout in minutes | 5 minutes |
i | wait-interval | Wait interval of data generation service to check entity status updates in seconds | 10 seconds |
p | partial-result | Allow partial results of entity generation | false |
t | timeout | Program execution timeout in minutes | 15 minutes |
The following is the output of java -jar conjurer-shell/target/data-conjurer.jar -h:
Usage: conjure [-hpV] [-c=<maxCollision>] [-e=<timeOutInMinutes>]
[-i=<generationInterval>] [-t=<maxTimeout>] <schema> <plan>
Command to generate data
<schema> Data schema file
<plan> Data generation plan
-c, --max-collision=<maxCollision>
Max occurrence of generated records which violate
index constraints for each entity
-e, --entity-timeout=<timeOutInMinutes>
Single entity generation timeout in minutes
-h, --help Show this help message and exit.
-i, --wait-interval=<generationInterval>
Wait interval of data generation service to check
entity status updates in seconds
-p, --partial-result Allow partial results of entity generation
-t, --timeout=<maxTimeout>
Program execution timeout in minutes
-V, --version Print version information and exit.
When generating a large number of rows, you may want to increase the max-collision, entity-timeout, and timeout values accordingly.
partial-result allows the tool to move on and generate data for the next entity after max-collision is reached for the current entity. It is experimental; use it with caution.
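As an illustration of raising these limits for a larger plan, the sketch below wraps the command in a hypothetical helper named `conjure_large`. The option names come from the help output above; the values (1000, 15, 60), the jar file name, and the `DRY_RUN` switch are assumptions made for this example, so tune them to your own data volume.

```shell
# Illustrative values only; adjust them to the size of your plan.
MAX_COLLISION=1000   # --max-collision, default 100
ENTITY_TIMEOUT=15    # --entity-timeout in minutes, default 5
TIMEOUT=60           # --timeout in minutes, default 15

# $1 = schema file, $2 = plan file
# Set DRY_RUN=1 to print the command instead of running it.
conjure_large() {
  set -- java -jar data-conjurer-1.0.0.jar \
    --max-collision="$MAX_COLLISION" \
    --entity-timeout="$ENTITY_TIMEOUT" \
    --timeout="$TIMEOUT" \
    "$1" "$2"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example:
#   DRY_RUN=1 conjure_large examples/helloworld/schema.yaml examples/helloworld/plan.yaml
```

The per-entity timeout is kept below the overall program timeout here so that a single slow entity cannot consume the whole run.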
https://github.com/taodong/data-conjurer/wiki/Data-Conjurer-Reference