Triggering a Data Pipeline
This feature enables a Provider Data Scientist, Business Analyst, or Data Engineer to trigger a Data Pipeline in the Bristlecone NEO® Platform.
A Data Pipeline can be triggered in three ways:
- Automated Run/Trigger: The Data Pipeline is triggered when a file is uploaded from the Raw Data Lake Zones.
Note: No user action is needed for an automated run; the file upload itself starts the Data Pipeline.
- Scheduled Run/Trigger: The Data Pipeline can be scheduled to be triggered at a specified date and time.
- On-demand Run: The Data Pipeline is executed on demand.
Scheduled trigger
Steps to perform a scheduled trigger
Step 1: Click the Schedule icon, as shown below.
A Data Pipeline can be scheduled at the following granularities:
- Today
Select this option to trigger the Data Pipeline on the current day.
The time can be entered in 12-hour or 24-hour format, as shown below.
- Tomorrow
Select this option to trigger the Data Pipeline on the following day.
The time can be entered in 12-hour or 24-hour format, as shown below.
- Every
Select this filter to trigger the Data Pipeline on the selected days of every month, at the time specified by the user (12-hour or 24-hour format), as shown below.
Note: The user can select multiple days. If the user selects a particular date, the Data Pipeline is triggered on that date of every month.
Note: The user can also select a day of the week, which then applies to every week of the month. In the example shown below, the Data Pipeline is triggered every Friday of the month.
- Next
Select this filter to trigger the Data Pipeline at the specified time. Unlike the Every filter, this scheduled trigger happens only once, at the date and time set by the user, as shown below.
Note: The user can select multiple days. If a selected day has already passed, the run is carried forward to the same day of the next month.
- Daily
Select this filter to schedule the Data Pipeline to run daily, as shown below.
Note: The date selection is disabled for this filter.
- Hourly
The Data Pipeline can be scheduled on an hourly basis.
Click “Every [Select the number of hours]” and specify the number of hours.
Example: If the user selects every 2 hours, the Data Pipeline is triggered at 2-hour intervals.
A list of available intervals is provided to the user, as shown below.
Note: With the hourly schedule, the interval can be set anywhere from 1 to 23 hours.
After setting the hourly parameters, click the Schedule button. The Data Pipeline then runs after the specified time interval.
Note: The values can be reset using the Reset option, as shown below.
Note: The user can remove a schedule using the following option.
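The date rules described for the Every, Next, and Hourly granularities can be sketched in a few lines of Python. This is an illustration only, not NEO Platform code: the function names (next_run_for_day, runs_every_weekday, hourly_runs) are hypothetical, and the sketch assumes that a Next schedule skips any month that does not contain the selected day, which this guide does not specify.

```python
import calendar
from datetime import date, datetime, time, timedelta


def _months_from(year, month):
    """Yield (year, month) pairs starting at the given month, without end."""
    while True:
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def next_run_for_day(day_of_month, at, now):
    """'Next': a single run on the chosen day of the month.

    If that day and time have already passed, the run is carried
    forward to the same day of the next month.
    """
    if not 1 <= day_of_month <= 31:
        raise ValueError("day_of_month must be between 1 and 31")
    for year, month in _months_from(now.year, now.month):
        if day_of_month > calendar.monthrange(year, month)[1]:
            continue  # assumption: skip months that lack this day
        candidate = datetime.combine(date(year, month, day_of_month), at)
        if candidate > now:
            return candidate


def runs_every_weekday(year, month, weekday, at):
    """'Every' with a day of the week: one run per matching day in the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        datetime.combine(date(year, month, day), at)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == weekday
    ]


def hourly_runs(start, every_hours, count):
    """'Hourly': the first `count` runs, spaced `every_hours` apart."""
    if not 1 <= every_hours <= 23:
        raise ValueError("the hourly interval must be between 1 and 23")
    return [start + timedelta(hours=every_hours * i) for i in range(1, count + 1)]


if __name__ == "__main__":
    now = datetime(2024, 3, 20, 10, 0)
    # Day 15 has already passed in March, so the run moves to April 15.
    print(next_run_for_day(15, time(9, 0), now))
    # Every Friday of March 2024: the 1st, 8th, 15th, 22nd and 29th.
    print(runs_every_weekday(2024, 3, calendar.FRIDAY, time(9, 0)))
    # Every 2 hours from 10:00: 12:00, 14:00, 16:00.
    print(hourly_runs(now, 2, 3))
```

The difference between the two date-based filters shows up in the return types: the Next helper returns a single run, while the Every helper returns one run per matching day and would be evaluated again for each following month.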
After scheduling, the Data Pipeline is listed as shown below.
The triggered Data Pipeline appears in the Data Pipeline list, as shown below.
Note: When a pipeline is scheduled, it is invoked only at the scheduled time. If new files are ingested on the configured data source path, the pipeline is not triggered.
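The rule in the note above can be expressed as a small decision function. This is a simplified model for illustration, not the platform's implementation: the function name should_trigger and the event names "file_uploaded" and "schedule_due" are hypothetical, and on-demand runs are left out because this guide does not state how they interact with a schedule.

```python
def should_trigger(has_schedule, event):
    """Return True if the event should start the Data Pipeline.

    has_schedule: whether a schedule is configured for the pipeline.
    event: "file_uploaded" or "schedule_due" (hypothetical event names).
    """
    if event == "file_uploaded":
        # A scheduled pipeline ignores newly ingested files.
        return not has_schedule
    if event == "schedule_due":
        # Only a scheduled pipeline reacts to its schedule time.
        return has_schedule
    raise ValueError(f"unknown event: {event!r}")
```

For a scheduled pipeline, should_trigger(True, "file_uploaded") returns False, which matches the note: new files on the configured data source path do not start the run.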
On demand trigger
Steps to trigger a Data Pipeline on demand
Step 1: Select a Data Pipeline and click Run.
Step 2: Select the file and click the Run button.
To view the Execution Details, click the Data Pipeline name on the Pipeline Summary dashboard.