CLI
Run ETL
The ETL process is run through a command-line interface (CLI), using the following command:
run_etl
Extracts, verifies, cleans, and loads data from datasheet images.
Extracts data from the images in the input directory, verifies the extraction with the user, cleans and validates the data, and loads it into the output directory.
Returns:
Path to the saved cleaned data file.
Usage
run_etl [OPTIONS]
Options
- --input_dir <input_dir>
Required. Path to the input directory containing datasheet images.
- --output_dir <output_dir>
Path to the output directory where processed data will be saved. If the path is empty, defaults to a dated directory in the current working directory.
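
The option handling described above can be sketched in Python as follows. This is an illustration only, not the tool's actual implementation: it uses the standard-library `argparse` module, the helper names `build_parser` and `resolve_output_dir` are hypothetical, and the ISO-date name for the default directory is an assumption, since the exact naming of the "dated directory" is not specified here.

```python
import argparse
from datetime import date
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the two documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="run_etl")
    parser.add_argument(
        "--input_dir",
        required=True,
        help="Path to the input directory containing datasheet images.",
    )
    parser.add_argument(
        "--output_dir",
        default="",
        help="Path to the output directory; empty means a dated default.",
    )
    return parser


def resolve_output_dir(output_dir: str, today: Optional[date] = None) -> Path:
    """Return the given path, or a dated directory in the current working directory if empty."""
    if output_dir:
        return Path(output_dir)
    today = today or date.today()
    # Assumed naming scheme: ISO date such as 2024-05-17. The real tool may differ.
    return Path.cwd() / today.isoformat()


# Example: only the required option is supplied, so the output falls back to the default.
args = build_parser().parse_args(["--input_dir", "datasheets"])
print(resolve_output_dir(args.output_dir))
```

Keeping the fallback in a separate function makes the empty-path rule easy to check on its own, independent of argument parsing.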