The ETL Process

There are four steps in the ETL process:
  1. Use custom extraction programs to extract customer data to flat files. Flat files can be categorized as one of the following UDE types:
    • Customer master data
    • Customer attribute data
    • Customer subscription interests
  2. Transfer flat files to the designated directory. Customers should use whatever file transfer mechanism their IT group has set up for the Conduits for Connect implementation and place the UDE(s), file list, and indicator file in the SrcFiles directory.

    The Informatica mappings always look for an indicator file before running the load operation. Even if the UDE(s) and file list file are uploaded into the SrcFiles directory, the job will not initiate data loads unless an indicator_*.dat file is detected in the SrcFiles directory.

    The File Wait job associated with the Informatica mappings runs continuously, waiting for an indicator file to appear. When it detects one, it reads the file list file to determine which UDEs to process and which mapping to run.

    For very large input data files (that is, more than 200,000 profiles), expect a few hours of turnaround time before the job finishes.

  3. Use Informatica mappings to process flat files and load the raw data into Connect ETL staging tables.
  4. Informatica automatically processes the ETL staging tables and loads the processed, validated data into Connect customer schema tables.

    Informatica mappings define how data is processed when it moves from the flat file(s) to the Connect ETL staging tables. After the data is loaded into the Connect ETL staging tables, a post-session command reads it from the staging tables, validates its format and content, and loads the validated records into Connect.
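
The indicator-file check described in step 2 can be illustrated with a short Python sketch. This is not the File Wait job itself, only a minimal model of the gating behavior: nothing is processed until an indicator_*.dat file exists in the source directory, and the file list then names the UDE files to load. The file list name (filelist.txt), its one-name-per-line format, and the function name find_ready_files are assumptions made for illustration; the actual names and formats depend on the implementation.

```python
from pathlib import Path
from typing import List, Optional

# Hypothetical name: this document does not say what the file list file is called.
FILE_LIST_NAME = "filelist.txt"


def find_ready_files(src_dir: Path) -> Optional[List[Path]]:
    """Return the UDE files named in the file list, or None if the load should not start."""
    # The load is gated on an indicator_*.dat file; without one, nothing runs,
    # even if the UDE files and the file list are already in place.
    if not any(src_dir.glob("indicator_*.dat")):
        return None

    file_list = src_dir / FILE_LIST_NAME
    if not file_list.is_file():
        return None

    # Assumed format: one UDE file name per line; blank lines are ignored.
    names = [line.strip() for line in file_list.read_text().splitlines() if line.strip()]
    return [src_dir / name for name in names]
```

Because the check keys on the indicator file alone, one reasonable practice (an inference, not a documented requirement) is to transfer the UDE files and the file list first and the indicator file last, so that the load does not start on a partially transferred set.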