This demo shows how to use #Azure #DataFactory with #MappingDataFlows to transform and create multiple tables from a list of tables, iterating over each table even when the table schemas differ.
Comments: 18
@neothomass 3 years ago
Your voice is like that of a late-night radio show host... :D (Compliment only)
@priyab9225 3 years ago
Hi, how can we perform a delta/incremental load for multiple tables from Azure SQL Database to Azure SQL Database using a single pipeline, so that I don't have to write a merge stored procedure for every table?
@akshatanand8027 3 years ago
Hey, I am copying multiple tables from SQL DB to Blob using parameters, and I have to find errors in the rows (errors like data type mismatches or duplicate rows) and send the error rows to another destination. For this I am using a Conditional Split, but it asks for a column name, and since I am copying multiple tables I don't have a fixed schema. Can you help me with this, please?
@LeonGraven 2 years ago
Did I hear you say "trim() to make any string field a varchar or whatever"?? I tested this and only got nvarchar(max)! Data flow seems to only create nvarchar(max) columns in SQL or Synapse from strings. Is it possible to create a new table with, for example, a (n)varchar(10) column?
@adityaraj3321 3 years ago
Hi, I want to make my source query dynamic by using dataset parameters. How can I do that?
@ankitdagreat 4 years ago
Hi, good explanation, but I can't see how to evaluate a particular column on each table from my source dataset. For example, if my source tables have a column called Load Date and I would like to copy to the Sink only the rows where Load Date is greater than 31/12/2019, I can't do it. Can you show whether there is a workaround for this kind of scenario using a data flow?
@MSDataFactory 4 years ago
Add a Data Flow parameter to your data flow as a String; call it "myTableName". In the pipeline, set the myTableName parameter to the same value as the dataset's table name parameter using "@item().table_name". In the Data Flow, add this query to the Source options: "select * from {$myTableName} where Load_Date > '31/12/2019'". You can also parameterize the date or use an expression function like currentDate() in your query.
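For reference, a minimal sketch of both sides of that wiring (the ForEach loop over a table list and the table_name property are assumptions taken from the reply above, not confirmed details from the video):

Pipeline side (Data Flow activity parameter, set inside the ForEach):
    myTableName = @item().table_name

Data Flow side (Source options > Query, entered via "Add Dynamic Content"; the expression must return a string, and {$myTableName} is data flow string interpolation):
    "select * from {$myTableName} where Load_Date > '2019-12-31'"

The date literal is illustrative only; as the reply notes, it can be parameterized or replaced with an expression function such as currentDate().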
@ankitdagreat 4 years ago
@@MSDataFactory And what if my date is being compared against a load date in a separate watermark table? To be more precise, I am trying to follow this article ( microsoft-bitools.blogspot.com/2019/05/azure-incremental-load-using-adf-data.html?m=1 ) using your method; it uses a watermark table with a load date and table name that are evaluated, and my source tables have both of these columns.
@ankitdagreat 4 years ago
I also tried the approach you suggested above but am receiving an error. Here are my steps, showing that what you suggested is not working, and also what I am trying to achieve: drive.google.com/file/d/1pfBwC4k5J0mnOi96LPj-2twquhYQ5V4c/view?usp=sharing
@MSDataFactory 4 years ago
@@ankitdagreat In the Query field of the Source, you need to add that string in the expression builder. Click on "Add Dynamic Content", then enter that string.
@ankitdagreat 4 years ago
@@MSDataFactory If you look at the screenshot I shared in the link above, I did exactly that for the Source, and validation fails with the error that the 'Query' expression should return a string.
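A likely cause, for reference: in the Data Flow expression builder the Query must be a string expression, so the whole SQL text has to be wrapped in double quotes; entering it unquoted is a typical trigger for the "'Query' expression should return string" validation error. A hedged sketch of the quoted form, reusing the parameter name from above:

    "select * from {$myTableName} where Load_Date > '2019-12-31'"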
@ashutoshshukla1981 4 years ago
How would you create a hash key dynamically?
@MSDataFactory 4 years ago
Try this technique using a hash function like sha2() together with columns() to get all columns present in the current runtime context and pass them to the hash function: techcommunity.microsoft.com/t5/azure-data-factory/new-data-flow-functions-for-dynamic-reusable-patterns/ba-p/1394313
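For reference, a minimal sketch of that pattern in a Derived Column transformation (the output column name rowHash is an assumption, not from the thread; 256 is the SHA-2 bit length):

    rowHash = sha2(256, columns())

columns() resolves to all columns present in the current runtime schema, so the same hash expression works unchanged across tables with different schemas.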
@ashutoshshukla1981 4 years ago
@@MSDataFactory Thanks. Has this been added recently? I don't see these functions in the version I'm using.
@ashutoshshukla1981 4 years ago
I got it now. I have another question: I have an ADF pipeline that pulls data from Azure SQL into Synapse. Is it best to leave it there and push the data into Synapse, or to create a new pipeline in Synapse and pull the data from Azure SQL? Considering both cost and performance, what would be the recommended approach here?