Help & FAQ | Synthetic Data Generator

🚀

Quick Start Guide

Generate your first dataset in 4 simple steps

1. Choose Your Method

Upload an existing schema file (JSON, YAML, or CSV) or use the visual Schema Builder to create one from scratch. Try loading an example schema to see how it works.

2. Define Your Data Structure

Add entities (tables) and fields (columns). Choose from 100+ field types including names, emails, addresses, UUIDs, dates, and more.

3. Configure Settings

Set the number of records, choose your output format (JSON, CSV, SQL, or XML), select a locale, and optionally add realistic noise/errors.

4. Generate & Export

Click "Generate Data" to produce the dataset, then copy it to the clipboard or download it.

💡

Pro Tip

Start with an example schema (Users, E-commerce, or Employees) to understand the structure, then modify it for your needs.

🏗️

Using the Schema Builder

Create custom data structures without writing code

📦

Adding Entities

Entities represent tables or collections. Click "+ Add Entity" and give it a name like "users", "products", or "orders". Each entity will generate its own set of records.

📝

Adding Fields

Fields are the columns in your entity. Specify the field name, choose a type from 100+ options, and optionally mark it as nullable or unique.

⚙️

Field Options

Some types take extra options: Enum accepts a list of custom values, Number and Integer accept min/max ranges, and Reference creates a foreign-key link to another entity.

🔗

Relationships

Connect entities with 1:1, 1:N, or N:N relationships. The generator ensures referential integrity across your dataset.
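A minimal sketch of how referential integrity in a 1:N relationship can be kept, written in Python for illustration. It is not the generator's own code; the function name `generate_users_and_orders` and the record shapes are hypothetical. The idea is to create the parent records first and draw every foreign key from ids that actually exist.

```python
import random
import uuid

def generate_users_and_orders(n_users, n_orders, seed=0):
    """Hypothetical helper: build parents first, then children that point at them."""
    rng = random.Random(seed)
    # Parent entity: each user gets a UUID built from seeded random bits.
    users = [{"id": str(uuid.UUID(int=rng.getrandbits(128), version=4))}
             for _ in range(n_users)]
    user_ids = [u["id"] for u in users]
    # Child entity: user_id is always chosen from existing parent ids (1:N).
    orders = [{"id": i + 1, "user_id": rng.choice(user_ids)}
              for i in range(n_orders)]
    return users, orders

users, orders = generate_users_and_orders(n_users=5, n_orders=20)
valid_ids = {u["id"] for u in users}
print(all(o["user_id"] in valid_ids for o in orders))  # True
```

Generating parents before children is what makes the check at the end hold by construction rather than by luck.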

📤

Uploading Schema Files

Import existing schemas in JSON, YAML, or CSV format

JSON

{
  "entities": [
    {
      "name": "users",
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "email", "type": "email" },
        { "name": "name", "type": "fullName" },
        { "name": "age", "type": "integer", "min": 18, "max": 65 },
        { "name": "status", "type": "enum", "values": ["active", "pending", "inactive"] }
      ]
    }
  ]
}
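Before uploading a schema file, it can help to sanity-check it locally. The sketch below is a Python illustration under assumptions: the function `check_schema` and the rules it applies (every field has a name and type, enums list values, min does not exceed max) are hypothetical and are not the generator's documented validation.

```python
import json

# The JSON example schema, embedded as text so the sketch is self-contained.
SCHEMA_TEXT = """
{
  "entities": [
    {
      "name": "users",
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "email", "type": "email" },
        { "name": "name", "type": "fullName" },
        { "name": "age", "type": "integer", "min": 18, "max": 65 },
        { "name": "status", "type": "enum", "values": ["active", "pending", "inactive"] }
      ]
    }
  ]
}
"""

def check_schema(schema):
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = []
    for entity in schema.get("entities", []):
        entity_name = entity.get("name", "<unnamed>")
        seen = set()
        for field in entity.get("fields", []):
            name = field.get("name")
            if not name or "type" not in field:
                problems.append(f"{entity_name}: field needs both name and type")
                continue
            if name in seen:
                problems.append(f"{entity_name}.{name}: duplicate field name")
            seen.add(name)
            if field["type"] == "enum" and not field.get("values"):
                problems.append(f"{entity_name}.{name}: enum without values")
            if "min" in field and "max" in field and field["min"] > field["max"]:
                problems.append(f"{entity_name}.{name}: min is greater than max")
    return problems

schema = json.loads(SCHEMA_TEXT)
print(check_schema(schema))  # [] for the example schema
```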

YAML

entities:
  - name: users
    fields:
      - name: id
        type: uuid
      - name: email
        type: email
      - name: name
        type: fullName
      - name: created_at
        type: datetime

📋

Field Types Reference

100+ field types organized by category

Category	Types	Example Output
Basic	`string` `number` `integer` `boolean` `date` `datetime`	"hello", 42.5, true, "2024-01-15"
Personal	`firstName` `lastName` `fullName` `email` `phone` `username` `age` `gender`	"John", "Doe", "john.doe@email.com"
Address	`address` `street` `city` `state` `country` `zipCode` `latitude` `longitude`	"123 Main St", "New York", "10001"
Business	`company` `jobTitle` `department` `product` `price` `creditCard` `iban`	"Acme Corp", "Engineer", "$99.99"
Internet	`url` `domain` `ip` `ipv6` `mac` `userAgent`	"https://example.com", "192.168.1.1"
Identifiers	`uuid` `id` `mongoId` `nanoid` `slug`	"550e8400-e29b-41d4-a716..."
Text	`word` `words` `sentence` `paragraph` `lorem`	"Lorem ipsum dolor sit amet..."
Custom	`enum` `weightedEnum` `regex` `reference`	Values from your custom list

🎛️

Error & Noise Settings

Add realistic imperfections for testing

Null Values Rate

Randomly replaces values with NULL to test handling of missing data. Recommended: 5-15%

Missing Data, Validation
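To make the null rate concrete, here is a rough Python sketch of the idea: each value is independently replaced with a null with the configured probability. The function `inject_nulls` and the choice to protect the `id` field are assumptions made for the example, not the generator's actual behavior.

```python
import random

def inject_nulls(records, rate, seed=None, protected=("id",)):
    """Hypothetical helper: null out each unprotected value with probability `rate`."""
    rng = random.Random(seed)
    noisy = []
    for record in records:
        noisy.append({
            key: (None if key not in protected and rng.random() < rate else value)
            for key, value in record.items()
        })
    return noisy

rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(1000)]
noisy = inject_nulls(rows, rate=0.10, seed=42)
missing = sum(1 for r in noisy if r["email"] is None)
print(f"{missing} of {len(noisy)} emails are null")  # roughly 10%
```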

Typo Rate

Introduces character swaps, deletions, and substitutions in text. Great for testing fuzzy matching.

Text Quality, Spell Check
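The three typo kinds named above (swaps, deletions, substitutions) can be sketched in a few lines of Python. This is an illustration only; `add_typo` and `apply_typos` are hypothetical names, and the real generator may choose positions and characters differently.

```python
import random
import string

def add_typo(text, rng):
    """Apply one random edit: swap neighbours, delete a character, or substitute one."""
    if len(text) < 2:
        return text  # too short to edit meaningfully
    i = rng.randrange(len(text) - 1)
    kind = rng.choice(["swap", "delete", "substitute"])
    if kind == "swap":
        return text[:i] + text[i + 1] + text[i] + text[i + 2:]
    if kind == "delete":
        return text[:i] + text[i + 1:]
    # Substitution may occasionally pick the same letter and leave the text unchanged.
    return text[:i] + rng.choice(string.ascii_lowercase) + text[i + 1:]

def apply_typos(values, rate, seed=None):
    """Introduce a typo into each value with probability `rate`."""
    rng = random.Random(seed)
    return [add_typo(v, rng) if rng.random() < rate else v for v in values]

print(apply_typos(["synthetic", "generator", "dataset"], rate=0.5, seed=3))
```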

Format Error Rate

Creates malformed emails, dates, and formatted fields. Perfect for testing input validation.

Validation, Edge Cases
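As an illustration of what a format error can look like, the Python sketch below breaks a valid email in one of several ways and confirms that a simple validator rejects the result. The helper `break_email` and the deliberately loose `EMAIL_RE` pattern are assumptions for the example; the generator's own malformed outputs may differ.

```python
import random
import re

# Loose structural check (something@something.something), not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def break_email(email, rng):
    """Hypothetical helper: return one malformed variant of a valid email."""
    local, _, domain = email.partition("@")
    variants = [
        local + domain,                         # missing @
        local + "@",                            # missing domain
        "@" + domain,                           # missing local part
        local + "@@" + domain,                  # doubled @
        local + "@" + domain.replace(".", ""),  # domain without a dot
    ]
    return rng.choice(variants)

rng = random.Random(1)
bad = break_email("john.doe@email.com", rng)
print(bad, bool(EMAIL_RE.match(bad)))
```

Feeding such values into your own validation code is the point of the setting: every variant here should be rejected.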

Duplicate Rate

Adds duplicate records to simulate real-world data entry errors. Test your deduplication logic.

Data Quality, Deduplication
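A small Python sketch of the duplicate-rate idea together with a naive deduplication pass to test against it. The names `add_duplicates` and `deduplicate` are hypothetical, and interpreting the rate as extra rows relative to the original count is an assumption.

```python
import random

def add_duplicates(records, rate, seed=None):
    """Append exact copies of randomly chosen records, then shuffle the result."""
    rng = random.Random(seed)
    n_dupes = round(len(records) * rate) if records else 0
    dupes = [dict(rng.choice(records)) for _ in range(n_dupes)]
    noisy = records + dupes
    rng.shuffle(noisy)
    return noisy

def deduplicate(records):
    """Keep the first occurrence of each record (values must be hashable)."""
    seen, unique = set(), []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

rows = [{"id": i, "name": f"user{i}"} for i in range(100)]
noisy = add_duplicates(rows, rate=0.10, seed=7)
print(len(noisy), len(deduplicate(noisy)))  # 110 100
```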

Outlier Rate

Generates extreme numeric values for testing edge cases and anomaly detection algorithms.

Analytics, Edge Cases
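One way to picture the outlier rate, sketched in Python: with the configured probability, a numeric value is pushed far outside the range seen in the data. The helper `inject_outliers` and its `factor` parameter are hypothetical; how the generator actually picks extreme values is not documented here.

```python
import random

def inject_outliers(values, rate, factor=10.0, seed=None):
    """Replace each value with an extreme one with probability `rate`."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid a zero span when all values are equal
    result = []
    for value in values:
        if rng.random() < rate:
            # Jump well beyond the observed range, either above or below it.
            direction = rng.choice((-1, 1))
            value = (hi if direction > 0 else lo) + direction * factor * span
        result.append(value)
    return result

ages = [rng_age for rng_age in range(18, 66)]
noisy = inject_outliers(ages, rate=0.05, seed=11)
print([v for v in noisy if v < 18 or v > 65])
```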

⚠️

Recommendation

High error rates (>20%) may produce data that's too noisy for most testing. Start with 5-10% for realistic scenarios.

⚡

Performance & Capacity

Built for speed and scale

1M+ records in seconds

100+ field types

270K records per second

100% client-side processing

Schema Complexity	Records	Time	Output Size
Simple (5 fields)	1,000,000	~4 seconds	~90 MB
Medium (30 fields)	500,000	~15 seconds	~500 MB
Complex (80+ fields)	250,000	~24 seconds	~730 MB
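The table figures can be turned into throughput and per-record size with simple arithmetic. The Python sketch below does that, assuming 1 MB means 1,000,000 bytes and treating the approximate times and sizes as exact; the function name `summarize` is only for the example.

```python
# (schema, records, seconds, output size in MB) copied from the table above
BENCHMARKS = [
    ("Simple", 1_000_000, 4, 90),
    ("Medium", 500_000, 15, 500),
    ("Complex", 250_000, 24, 730),
]

def summarize(records, seconds, size_mb):
    """Return (records per second, bytes per record), assuming 1 MB = 1,000,000 bytes."""
    return records / seconds, size_mb * 1_000_000 / records

for name, records, seconds, size_mb in BENCHMARKS:
    rate, bytes_per_record = summarize(records, seconds, size_mb)
    print(f"{name}: ~{rate:,.0f} records/s, ~{bytes_per_record:,.0f} bytes/record")
# Simple: ~250,000 records/s, ~90 bytes/record
# Medium: ~33,333 records/s, ~1,000 bytes/record
# Complex: ~10,417 records/s, ~2,920 bytes/record
```

Throughput drops as fields are added because each record carries more generated values, so estimate run time from the row closest to your schema rather than from the headline rate.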

✨

Reproducible Data

Use the seed feature in Settings to generate identical data every time. Perfect for consistent test fixtures and sharing datasets.
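Seeding works the same way in most random number generators: the same seed yields the same sequence. The Python sketch below shows the principle with the standard `random` module; `generate_ages` is a hypothetical stand-in for a real generation run, not the tool's API.

```python
import random

def generate_ages(count, seed):
    """Produce `count` ages between 18 and 65 from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, unaffected by other random calls
    return [rng.randint(18, 65) for _ in range(count)]

first_run = generate_ages(5, seed=42)
second_run = generate_ages(5, seed=42)
print(first_run == second_run)  # True: same seed, same data
```

Using a dedicated seeded generator instead of global random state keeps the output stable even when other code draws random numbers in between.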

❓

Frequently Asked Questions

Quick answers to common questions

Is my data sent to a server?

No. All data generation happens entirely in your browser using JavaScript. Your schemas and generated data never leave your computer, and the app works offline once loaded.

Can I use my existing database schema?

Yes. Either recreate the schema manually in the Schema Builder, or export your database schema to JSON or YAML and upload it. The field types map closely to common database column types.

How do I create relationships between entities?

Use the Relationships tab to define connections. For example, create a "user_id" field in orders, then add a relationship from orders.user_id to users.id. The generator maintains referential integrity automatically.

Can I generate data in other languages?

Yes. Change the Locale setting to generate names, addresses, and other locale-specific data in German, French, Spanish, Italian, Japanese, or Chinese, with appropriate formats and characters.

How do I restrict a field to specific values?

Use the "Enum" field type and enter comma-separated values like "active,pending,cancelled". For a weighted distribution, use "Weighted Enum" and give each value a weight: "active:70,pending:20,cancelled:10".
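To show what a weighted spec such as "active:70,pending:20,cancelled:10" means in practice, here is a Python sketch that parses it and samples values in proportion to the weights. The parser `parse_weighted_enum` is hypothetical and only mirrors the format described above.

```python
import random
from collections import Counter

def parse_weighted_enum(spec):
    """Split 'value:weight,value:weight' into parallel lists of values and weights."""
    values, weights = [], []
    for part in spec.split(","):
        value, _, weight = part.strip().rpartition(":")
        values.append(value)
        weights.append(float(weight))
    return values, weights

values, weights = parse_weighted_enum("active:70,pending:20,cancelled:10")
rng = random.Random(7)
sample = rng.choices(values, weights=weights, k=10_000)
print(Counter(sample).most_common())
```

With 10,000 draws the counts land close to a 70/20/10 split; weights are relative, so "7,2,1" would describe the same distribution.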

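Why do I see duplicate values in a field marked unique?

Duplicates can appear in large datasets when a field type has a limited pool of values (first names, for example). Use the UUID or ID types for fields that must be truly unique, or reduce the record count.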
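The effect of a limited value pool can be estimated with the birthday-problem approximation. The Python sketch below is illustrative: the pool of 3,000 first names is an assumed figure, not the generator's actual list size, and the UUID case counts the 122 random bits of a version 4 UUID.

```python
import math

def collision_probability(pool_size, n_records):
    """Approximate chance that n uniform draws from pool_size values contain a repeat."""
    if n_records > pool_size:
        return 1.0  # pigeonhole: a repeat is guaranteed
    exponent = n_records * (n_records - 1) / (2 * pool_size)
    # 1 - exp(-x), computed with expm1 so tiny probabilities do not round to zero
    return -math.expm1(-exponent)

print(collision_probability(3_000, 1_000))         # assumed name pool: close to 1
print(collision_probability(2 ** 122, 1_000_000))  # UUID v4: vanishingly small
```

This is why a name-like field cannot stay unique across a large dataset, while UUIDs remain effectively collision-free at a million records.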

Is the generator free to use?

Yes. The Synthetic Data Generator is completely free for personal and commercial projects, with no usage limits, no sign-up, and no watermarks on generated data.

What can I do if generation is slow?

For large datasets (100K+ records), try reducing the record count, simplifying your schema, using Chrome or Firefox, closing other tabs, or generating in smaller batches.

Ready to generate data?

Start creating realistic synthetic datasets in seconds

Open Generator →