I promised to blog about Azure Data Factory Data Flows, but decided to do this first. My business problem was to process files on an on-premise file share with SSIS without moving the original files anywhere. That is not easily done without the possibility to move the original file. I decided to use Azure Data Factory (ADF) and Azure Blob Storage to tackle this challenge. ADF is still missing a couple of critical features, but I managed to get the process working. I wanted to write about this, because I couldn't find any good instructions for this process in one place.

In ADF there is a nice component called "Get Metadata". I decided to use the file creation date and store it into a BatchControl table for the next run. First I have a "Lookup" activity to get the previous execution's max file creation date:

Select BatchDate, DATEADD(HH, -2, BatchDate) AS BatchDate2
From BatchControl
Where BatchId = (Select Max(BatchId) From BatchControl Where BatchName = 'GAVFiles' AND Completed = 1)

Blob storage has an interesting feature of changing the timestamp (sometimes) to UTC time, so I had to deduct two hours from the value to get a correct comparison value.

Next I create a new row into the BatchControl table. As the default value for my batch date I use the value I just selected in the previous activity; I'll explain later why. You could maybe achieve the same functionality by using a "Lookup" activity, because you can write free-hand SQL there.

Because there are many files in the source, I use the "Child Items" metadata attribute, which produces an array of file names. Note that in the dataset properties the File value should be empty to be able to use the "Child Items" metadata. Another note: this approach is slow if you have hundreds or thousands of files. A third note: the "Get Metadata" activity fails if you use the "Child Items" metadata attribute but your source container/folder is empty. The solution to avoid this situation is to keep a dummy file in your source and add an "If Condition" activity with a file name check (e.g. expression: 'myFileNamePrefix') to process only the correct files.

Then comes a "ForEach" activity, where I again use an expression which reads the previously created file list. The last part you need to check from the documentation or when debugging. Inside the "ForEach" activity you define sub-activities, which is like a pipeline of its own. There I have again a "Get Metadata" activity, but this time it processes only a single file at a time and gets the "Created" metadata. Next I have an "If Condition" activity to decide whether the file should be processed or not. There I compare the previous max batch date and the file metadata created date. Note that ADF is not able to compare dates directly, so I use ticks functions to convert the dates to integer values.

I have a sub-pipeline where I copy the file from the on-premise file share to Azure Blob Storage. The tricky part is in defining the source dataset: I set an expression into the properties to be able to handle file names that change between runs. In the UpdateBatch stored procedure I have logic which updates the BatchDate value only if the new value is greater than the previous value, because I'm not sure in which order ADF processed the files.

Next there shall be an "Execute SSIS Package" activity, which shall process the files from blob storage. After processing the files I have a "Lookup" activity which checks the batch dates between the current and the previous run to identify whether there have been new files:

Select A.BatchDate As BatchDate1, B.BatchDate As BatchDate0
From BatchControl A, BatchControl B
Where A.BatchId = (Select Max(BatchId) From BatchControl Where BatchName = 'GAVFiles' AND Completed = 0)
And B.BatchId = (Select Max(BatchId) From BatchControl Where BatchName = 'GAVFiles' AND Completed = 1)

One thing could not be done, because the "Copy" activity fails if the source container is empty, which is very odd behavior for a "Copy" activity. I also tried to check file existence first, but it didn't work, at least with blob files.
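As a rough sketch of the file-listing step, a "Get Metadata" activity reading "Child Items" and feeding a "ForEach" activity could look roughly like the following in ADF pipeline JSON. The activity and dataset names here are made up for illustration, and this is a trimmed fragment, not a complete pipeline definition:

```
{
  "name": "GetFileList",
  "type": "GetMetadata",
  "typeProperties": {
    "dataset": { "referenceName": "SourceFolderDataset", "type": "DatasetReference" },
    "fieldList": [ "childItems" ]
  }
},
{
  "name": "ProcessFiles",
  "type": "ForEach",
  "typeProperties": {
    "items": { "value": "@activity('GetFileList').output.childItems", "type": "Expression" },
    "activities": []
  }
}
```

The per-file sub-activities described in the post (the single-file "Get Metadata", the "If Condition" and the copy sub-pipeline) would go inside the ForEach's activities array.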
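For the dummy-file trick, the file name check needs an expression that skips the dummy entry inside the loop. Assuming the real files share a common prefix (the 'myFileNamePrefix' placeholder from the text), a minimal "If Condition" expression could be:

```
@startswith(item().name, 'myFileNamePrefix')
```

Each element of the "Child Items" array is an object with name and type properties, so item().name is the file name of the current iteration.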
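For the date comparison, the ticks() function converts a timestamp to an integer that greater() can compare. Assuming hypothetical activity names 'GetFileMetadata' (the per-file "Get Metadata") and 'GetPrevBatchDate' (the first "Lookup"), the "If Condition" expression could look like this sketch:

```
@greater(ticks(activity('GetFileMetadata').output.created), ticks(activity('GetPrevBatchDate').output.firstRow.BatchDate2))
```

firstRow is how a Lookup activity's single-row result is exposed; created is the metadata field requested from the per-file "Get Metadata" activity.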
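For the "tricky" source dataset, one way to handle changing file names is a parameterized dataset whose fileName property is an expression. This is an assumed sketch with made-up names; the exact dataset schema depends on your connector and ADF version:

```
{
  "name": "SourceFileDataset",
  "properties": {
    "type": "FileShare",
    "parameters": { "FileName": { "type": "string" } },
    "typeProperties": {
      "folderPath": "incoming",
      "fileName": { "value": "@dataset().FileName", "type": "Expression" }
    }
  }
}
```

The Copy activity inside the ForEach would then pass @item().name as the value of the FileName dataset parameter.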
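The UpdateBatch logic (update BatchDate only when the incoming value is greater, since file processing order is not guaranteed) could be sketched in T-SQL like this. The column names are inferred from the queries in the post; treat the exact schema as an assumption:

```sql
CREATE PROCEDURE UpdateBatch
    @BatchName VARCHAR(50),
    @BatchDate DATETIME
AS
BEGIN
    -- Update the open (Completed = 0) batch row only when the new date
    -- is greater; ADF may process files in any order, so an earlier
    -- file must not overwrite a later batch date.
    UPDATE BatchControl
    SET BatchDate = @BatchDate
    WHERE BatchId = (SELECT MAX(BatchId) FROM BatchControl
                     WHERE BatchName = @BatchName AND Completed = 0)
      AND BatchDate < @BatchDate;
END;
```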