Fakelake
Fakelake is a command-line tool that generates fake data from a YAML schema.
Example
Here is a YAML file that generates 1 million rows with 4 columns.
columns:
  - name: id
    provider: Increment.integer
    start: 42
    presence: 0.8

  - name: first_name
    provider: Person.fname

  - name: company_email
    provider: Person.email
    domain: soma-smart.com
    corrupted: 0.0001

  - name: created
    provider: Random.Date.date
    format: "%Y-%m-%d"
    after: 2000-02-15
    before: 2020-07-17

info:
  output_name: all_options
  output_format: parquet
  rows: 1_000_000
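To get a feel for what the options in this schema imply at 1 million rows, here is a minimal Python sketch. It is not Fakelake's implementation. It assumes that `presence: 0.8` means roughly 80% of `id` values are non-null and that `corrupted: 0.0001` means roughly 0.01% of `company_email` values are corrupted. The helper names `expected_counts` and `sample_rows` are hypothetical, and so are the choices to keep incrementing `id` on null rows and to treat `before` as exclusive.

```python
import random
from datetime import date, timedelta

# Values copied from the YAML example above.
ROWS = 1_000_000
START = 42
PRESENCE = 0.8       # assumption: fraction of rows where `id` is non-null
CORRUPTED = 0.0001   # assumption: fraction of rows with a corrupted email
AFTER = date(2000, 2, 15)
BEFORE = date(2020, 7, 17)


def expected_counts(rows, presence, corrupted):
    """Return (expected null ids, expected corrupted emails)."""
    return round(rows * (1 - presence)), round(rows * corrupted)


def sample_rows(n, seed=0):
    """Yield n illustrative rows with `id` and `created` columns only."""
    rng = random.Random(seed)
    span_days = (BEFORE - AFTER).days
    for i in range(n):
        # Assumption: the counter advances even when the value is null.
        row_id = START + i if rng.random() < PRESENCE else None
        # Assumption: `after` is inclusive and `before` is exclusive.
        created = AFTER + timedelta(days=rng.randrange(span_days))
        yield {"id": row_id, "created": created.strftime("%Y-%m-%d")}


print(expected_counts(ROWS, PRESENCE, CORRUPTED))  # (200000, 100)
```

Under these assumptions, the generated file would contain about 200,000 null `id` values and about 100 corrupted emails. Calling `list(sample_rows(5))` gives a small preview of the `id` and `created` columns.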
The documentation explains how to create your YAML file and how to generate data from a YAML file.