RDS Data Generator Help Page

How to access the tool?

Online version

The online Aurora Data Generator can be accessed here

Requirement: Your database must be publicly reachable on the port entered in the connection screen.
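
Before using the online version, it can help to confirm that the database port really is reachable. The snippet below is a minimal sketch, not part of the tool: the function name `is_port_reachable` is hypothetical, and it only tests that a TCP connection can be opened, not that the credentials are valid.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

For example, `is_port_reachable("mydb.example.com", 5432)` (a placeholder endpoint and port) should be run from a machine outside your own network, since the online tool connects over the public internet.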

Architecture diagram:

Login screen

Private deployment

Deploy the tool in your own Elastic Container Service (ECS) cluster, pulling the image from the public registry: Repository

Requirement: The tool listens over HTTP on port 80 and accesses the target database on the port you provided.

Architecture diagram:

Login screen

The CloudFormation template below automates the creation of your private environment. Download link
The CloudFormation template creates a VPC and a public subnet (both customizable) and an ECS Fargate cluster, then deploys the service running RDS Data Generator. Afterwards, you need to create the VPC peering that allows RDS Data Generator to access your database running in a private subnet.
A video explaining how to set up VPC peering is available here


How to use the tool?

Login page

Login screen

First select the target DB engine, then provide the requested information to log in. The table name is the target table into which the data will be loaded.

Setup page

Table description

The section below describes the target table of the data load. The data generator template MUST match the column order and types.

Login screen

Upload the data template file

This section describes the random data to generate for each column, along with the associated range. The table below describes the supported functions.

Supported format    Parameters                        Description
randomString        mini|maxi                         Generates a random string whose length varies between mini and maxi.
uniqueID            None                              Generates a UUID version 4 value, 32 characters long.
nullValue           None                              Returns NULL.
randomNumber        mini|maxi|step                    Returns a random number between mini and maxi, using the step given in the parameters.
randomFloat         mini|maxi                         Returns a random float between mini and maxi, with up to 2 decimal places.
gaussNumber         mu|sigma                          Selects a random number on a Gaussian curve.
randomList          list                              Randomly selects a value from the provided list.
randomListWeight    list|weightlist                   Randomly selects a value from the provided list. The selection is weighted by the weight list.
timeStamp           None                              Returns the current UTC timestamp. The target column must be of Timestamp type.
randomForeignKey    table name|primary key column     Inserts a foreign key into the table by randomly selecting a primary key value from another table.
randomDate          Oldest date|Latest date           Generates a random date between the min and max. Template file entry must match: randomDate|1700-01-01|2024-04-21
randomTime          Earliest time|Latest time         Generates a random time between the min and max. Template file entry must match: randomTime|08:00:00|22:00:00 - Not supported with Oracle.
randomTimeStamp     Oldest date|Latest date           Generates a random timestamp between the min and max. Template file entry must match: randomTimeStamp|1970-01-02|2024-04-21. Oldest date must be > '1970-01-01'
sequenceNumber      First value                       Generates a sequential number, incremented by 1, starting with the "First value" parameter. Requires an Integer column type. This function provides a unique value.
randomFirstName     List of countries|balance ratio   See the notes below the table.
randomLastName      (same as randomFirstName)         See the notes below the table.
randomCity          (same as randomFirstName)         See the notes below the table.
randomCountry       (same as randomFirstName)         See the notes below the table.

Notes on randomFirstName, randomLastName, randomCity and randomCountry:
The list of countries is either the list here or a subset: ['Netherlands','Spain','Germany','Czech Republic','Poland','Norway','Italy','United Kingdom','France','Sweden','Denmark','Brazil','United States','Australia','Tunisia'].
The "balance ratio" parameter weights one country against another (see randomListWeight as an example). The format is [1,5,3,4,2], with as many values as there are countries in the first list. This parameter is optional.
The names and cities are localized. When a country is selected (for the first name, for instance), all fields in the record will match the same country (last name, city and country).
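
To give a feel for what several of the functions above produce, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the tool's implementation: the snake_case function names are hypothetical, and details such as the character set used by randomString are guesses.

```python
import random
import string
import uuid
from datetime import date, timedelta


def random_string(mini: int, maxi: int) -> str:
    # Assumption: letters only; the tool's actual character set is not documented here.
    length = random.randint(mini, maxi)
    return "".join(random.choices(string.ascii_letters, k=length))


def unique_id() -> str:
    # UUID version 4 rendered as 32 hexadecimal characters (no hyphens).
    return uuid.uuid4().hex


def random_number(mini: int, maxi: int, step: int) -> int:
    # randrange excludes the stop value, so add 1 to allow maxi itself.
    return random.randrange(mini, maxi + 1, step)


def random_float(mini: float, maxi: float) -> float:
    return round(random.uniform(mini, maxi), 2)


def gauss_number(mu: float, sigma: float) -> float:
    return random.gauss(mu, sigma)


def random_list_weight(values: list, weights: list):
    # Values with a larger weight are picked proportionally more often.
    return random.choices(values, weights=weights, k=1)[0]


def random_date(oldest: str, latest: str) -> date:
    start, end = date.fromisoformat(oldest), date.fromisoformat(latest)
    return start + timedelta(days=random.randint(0, (end - start).days))
```

For instance, `random_list_weight(["France", "Spain"], [1, 5])` would return "Spain" about five times as often as "France", which mirrors the role of the balance ratio described above.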

Click the link to download an example of the parameter file: Download link

The screenshot below shows how to upload the template file before submitting it for analysis.

Login screen

Once submitted, the template file is analyzed and a report page gives its status. The load cannot be started until the file is error-free.
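
The kind of check performed at this stage can be sketched as follows. This is a hypothetical pre-check you could run locally, not the tool's analyzer. It assumes one pipe-separated entry per line with the function name first, as in the `randomDate|1700-01-01|2024-04-21` example. Parameter counts are only checked for functions with scalar parameters, because the exact syntax of list parameters is not specified on this page.

```python
# Expected number of parameters per function; None means "not checked here".
PARAM_COUNTS = {
    "randomString": 2, "uniqueID": 0, "nullValue": 0, "randomNumber": 3,
    "randomFloat": 2, "gaussNumber": 2, "timeStamp": 0, "randomForeignKey": 2,
    "randomDate": 2, "randomTime": 2, "randomTimeStamp": 2, "sequenceNumber": 1,
    "randomList": None, "randomListWeight": None, "randomFirstName": None,
    "randomLastName": None, "randomCity": None, "randomCountry": None,
}


def check_template(lines):
    """Return a list of error messages; an empty list means no error was found."""
    errors = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            continue  # ignore blank lines
        name, *params = [part.strip() for part in entry.split("|")]
        if name not in PARAM_COUNTS:
            errors.append(f"line {number}: unknown function '{name}'")
            continue
        expected = PARAM_COUNTS[name]
        if expected is not None and len(params) != expected:
            errors.append(
                f"line {number}: {name} expects {expected} parameter(s), got {len(params)}"
            )
    return errors
```

Calling `check_template(open("template.txt").read().splitlines())` on a local copy of the file (the file name is a placeholder) lists any entries with an unknown function name or the wrong number of parameters.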

Login screen

Select the data ingestion mode and rate

This fieldset lets you select the number of records ingested into the target table. You can select one of the following two modes:

Random Mode: The number of records is generated randomly every minute following a daylight curve (more records on weekdays, fewer at night or on the weekend). This is the default mode.

Flat Mode: You select the number of records ingested every minute using the slider. Every thread (see the load charge section below) will ingest this number of records. Please note that the maximum is not guaranteed, as every execution loop is reinitialized every minute.
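
The difference between the two modes can be illustrated with a small sketch. The actual curve used by the tool is not documented here, so the cosine-shaped daylight curve, the weekend factor of 0.3 and the `peak` value below are all assumptions chosen for illustration, as is the function name `records_for_minute`.

```python
import math
import random
from datetime import datetime


def records_for_minute(mode: str, now: datetime, flat_rate: int = 100, peak: int = 600) -> int:
    """Return how many records one thread would target for the current minute."""
    if mode == "flat":
        # Flat mode: the same slider value every minute.
        return flat_rate
    # Random mode: assumed daylight curve, 0 at midnight and 1 at noon.
    hour = now.hour + now.minute / 60
    daylight = 0.5 * (1 - math.cos(2 * math.pi * hour / 24))
    # Assumed weekend reduction (Saturday and Sunday).
    weekday_factor = 1.0 if now.weekday() < 5 else 0.3
    expected = peak * daylight * weekday_factor
    # Random jitter so that two consecutive minutes differ.
    return int(expected * random.uniform(0.5, 1.0))
```

With these assumptions, a weekday at noon yields between 300 and 600 records for the minute, a Saturday at noon at most 180, and any day at midnight yields 0.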

Login screen

Auto Stop mode

This option tells RDS Data Generator to stop inserting records after a period of time or after a number of records.

None: The user stops the load manually.

Minutes: The data generation stops automatically after the number of minutes defined in the open field (10 minutes in the example below).

Records: The data generation stops automatically soon after the number of records has been inserted. The number of records inserted will be slightly above the value provided.
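
One plausible reason for the overshoot in Records mode is that the stop condition is evaluated between batches of inserts rather than after every single record. The sketch below illustrates that idea only. It is an assumption, not the tool's code, and `should_stop`, `run_load` and `batch_size` are hypothetical names.

```python
import time


def should_stop(mode: str, limit: int, started_at: float, inserted: int, now=None) -> bool:
    """Evaluate the Auto Stop condition for the three modes."""
    if mode == "none":
        return False  # the user stops the load manually
    if mode == "minutes":
        now = time.monotonic() if now is None else now
        return (now - started_at) >= limit * 60
    if mode == "records":
        return inserted >= limit
    raise ValueError(f"unknown auto stop mode: {mode}")


def run_load(mode: str, limit: int, batch_size: int = 250) -> int:
    """Simulate a load; each loop iteration stands in for one batch insert."""
    inserted = 0
    started_at = time.monotonic()
    while not should_stop(mode, limit, started_at, inserted):
        inserted += batch_size
    return inserted
```

For example, `run_load("records", 1000, batch_size=300)` returns 1200: the limit is only noticed once a whole batch has pushed the count past 1000. Do not call `run_load` with the "none" mode, since this simulation has no manual stop.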

Login screen

Select the load charge ratio

Between 1 and 10 injection threads can be started simultaneously. The checkbox below allows you to select the ratio.
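
Because every thread ingests its own share of records, the total load scales with the number of threads. The following sketch shows that relationship with Python's `concurrent.futures`. It is illustrative only: `inject` is a hypothetical stand-in for the real insert loop, and nothing here reflects how the tool is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def inject(thread_id: int, records: int) -> int:
    # Stand-in for the insert loop of one thread; returns the number "inserted".
    return records


def run_threads(thread_count: int, records_per_thread: int) -> int:
    """Run the injection on 1 to 10 threads and return the total inserted."""
    if not 1 <= thread_count <= 10:
        raise ValueError("thread_count must be between 1 and 10")
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        counts = pool.map(inject, range(thread_count), [records_per_thread] * thread_count)
        return sum(counts)
```

In Flat Mode with a slider value of 100 and 4 threads, `run_threads(4, 100)` gives a target of 400 records per minute.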

Login screen

Reporting page

Once the load has started, the page below appears. A statistics table monitors how many records were inserted. This table is automatically refreshed every 15 seconds.
The number of records inserted is randomly defined every minute in every thread. It also depends on the hour of the day and the day of the week, in order to mimic a human-facing web-based application. As a rough estimate, a "heavy" load generates about 6,000 records in 10 minutes.

Login screen

Contact

For any questions, contact Yann Allandit - allandit@amazon.ch

Updates

Version   Date                  Updates
1.0.0     January 2024          Initial release.
1.0.3     February 3, 2024      Added support for MySQL/MariaDB + DB2. Extended number of threads. Bug fixes.
1.0.4     February 16, 2024     Added support for MS SQL Server & Oracle.
1.0.5     February 21, 2024     Added foreign key function. Updated documentation.
1.1.0     March 3, 2024         New interface. Bug fixes.
1.1.1     March 15, 2024        Bug fixes.
1.1.2     March 24, 2024        Multi-session improvement.
1.1.3     March 28, 2024        Added randomTime & randomDate functions.
1.1.4b    April 5, 2024         Added randomTimeStamp function and fixed timeStamp & randomDate for Oracle. Presentation fix.
1.1.5     April 8, 2024         Added activity monitoring.
1.1.6b    April 9, 2024         Added Pace Maker mode for flat data ingestion rate. Logo.
1.1.7     April 16, 2024        Added sequenceNumber function.
1.1.8c    April 18, 2024        Added Auto Stop capability + auto refresh of the statistics table while the load is running.
1.1.9     April 19, 2024        Added First Name, Last Name, City & Country functions with selected regionalization. Change in monitoring + stop management.
1.1.10b   May 23, 2024          Multi-session bug & Java version fixes + other fixes.
1.1.11a   June 7, 2024          Increased scaling boundaries.
1.1.12c   September 13, 2024    Test sequenceNumber value before load. Check ID column size for uniqueID function before load. Fixed case issue with MySQL. Increased scaling boundaries. Created CloudFormation template for private deployment.