Get Data from PDFs and Send to EXCEL with Power Automate Desktop!

  Рет қаралды 44,969

Christine Payton

Christine Payton

Күн бұрын

Power Automate Desktop is the lesser-known cousin to Power Automate Cloud. As of 2023, it's free with the latest version of Windows - so you may already have it installed! You can use it with a free Microsoft account OR with an existing M365 subscription.
In this video, we go through how to extract and parse specific values and metadata from PDFs by looking for words "next to" our values and send them to an Excel sheet. This is a beginner-level video, though it helps to be familiar with the concept of regular expressions for an easy intro into robotic process automation (RPA).
This technique can be particularly useful for legal and finance groups, who have a tendency to get filled PDFs routinely and then need to organize them!
0:00 Demo
2:00 Configure the Excel template
2:49 Power Automate: Get files in folder
4:42 Loop through files and extract text
6:24 Parse text using regular expressions
9:00 Split text to isolate your values
10:44 Send to Excel

Пікірлер: 121
@kidd32888
@kidd32888 6 ай бұрын
omg you saved my organization hundred of hours and this is not a hyperbole! THANK YOU SO MUCH!!
@bi-ome
@bi-ome 6 ай бұрын
aw that's so great to hear!
@PowerBeaver
@PowerBeaver 9 ай бұрын
Awesome, thanks for posting this. I started doing such kind of thing using vba, which was quite a journey😅. Didn't think PA Desktop could do the trick. 👍🏻
@bi-ome
@bi-ome 9 ай бұрын
Yeah!! I didn't realize VBA could do this, so we both learned something. 😆
@killiancolombo331
@killiancolombo331 6 ай бұрын
That is pretty cool!
@akilalaksiri9172
@akilalaksiri9172 2 ай бұрын
Thank you very much! You saved me :)
@iracypimenta440
@iracypimenta440 5 ай бұрын
First, thank you for your content. I might be doing something wrong, although I have followed yours instructions step-by-step, because I am only able to get even without the break just one iteration in my pdf file. Thanks again
@bi-ome
@bi-ome 5 ай бұрын
Did you add a breakpoint? The red circle? That will stop it after one loop, I think I added one in the video to show it... you can click the dot to turn it off. Also check what you're looping over for the actions and whether the thing that you're looping over contains all the files you want to loop on (you can check this by looking into the variable content after it runs).
@jerrysaen
@jerrysaen 3 ай бұрын
Thank you very much! This helps me with my business
@yandisafanqa6050
@yandisafanqa6050 9 күн бұрын
Thank you for this…exactly what I need
@duchoanChu
@duchoanChu 7 күн бұрын
Thanks so much! It was very useful to me
@schmidtnolan
@schmidtnolan 8 ай бұрын
Great information! I've modified this slightly to use 'crop text' to extract text between two headers. Would you have any suggestions for when the headers have slight variations? I'm thinking of running a subloop with each potential iteration based on the condition of if the text is found... the initial build will stink but should do the trick. Thoughts?
@bi-ome
@bi-ome 8 ай бұрын
Cool! It depends on what your variations are… I’ve used the switch/case cards for this, basically to check if it contains the word I’m looking for before parsing (with one case for each variant, and slightly different regex for each).
@peterolah1165
@peterolah1165 8 ай бұрын
great video Christine! I got a question: I got many PDFs with multiple values to search and 1-2 of these are missing. I got the index out of range error message (right after the write to excel command), I try the error handling option and an IF sub flow but didn't work.
@bi-ome
@bi-ome 8 ай бұрын
Try putting the if-statement above the parse text step - so it will only run the parse text and split steps if the file contains the text you're looking for. That way you don't have to error handle it, because it won't error. :) When you do this, make sure to have a step that resets any variable you were using to store values to blank at the end of the loop, so that doesn't insert the prior loop's value in Excel when the text doesn't exist.
@hvidsteen89
@hvidsteen89 5 ай бұрын
Thank you very much for a great tutorial. But do you have any idea, if I can get this to work with all the files combined into a single PDF instead of all my source material being split into seperate files?
@bi-ome
@bi-ome 5 ай бұрын
Probably, you can use a regular expression to collect all matches for your query - I had GPT write one to try, this looks for the word "balance" and gets anything after it so you can see if something similar would work: balance:\s*\$?(\d+[\.,]?\d*) It should put all the matches in a list, then you can loop over those to do things after splitting out whatever you don't need. BUT you might also try connecting to the file with Power Query (it can use PDFs as a source) and see what that gives you. The only reason I didn't use it in the example is because multiple files usually don't have the exact same structure between the files, but if yours is all in one file it might work. Alternately there's a ton of new AI options in the cloud Power Automate for entity extraction that use AI Builder credits, and you likely have credits if you have Power Automate licenses to try.
@user-cl1ir5ui4i
@user-cl1ir5ui4i 4 ай бұрын
This is some really powerful stuff, thank you. In the video you used some reglar expressions that you use, where can I find the complete list?
@bi-ome
@bi-ome 4 ай бұрын
Thanks! There's not a list of regular expressions per se - here's more info on them: coderpad.io/blog/development/the-complete-guide-to-regular-expressions-regex/ ChatGPT/copilot are very good at creating regular expressions, so I usually just start by giving them what I'm looking for and an example and it will usually work. Do make sure to tell it something along the lines of "just give me the regular expression, not Python, and do not use look-aheads or look-behinds" to get the "right" thing to plug in. :)
@dvwatts
@dvwatts 3 ай бұрын
Thank you for the video! Maybe I missed it but was there an option to deal with multiple choice or check box data types?
@bi-ome
@bi-ome 3 ай бұрын
That's a really good question. I didn't need to get checkboxes so I didn't think to try - I found a PDF with checkboxes to prod at and when I copy and paste a checked box out of the PDF there is no indication that the box exists from that or that it's checked, which makes me think the extract text isn't the way to go here. I tried doing an internet search, too, and didn't find anything that looked like a functioning example of someone doing this, so you've stumped me lol. It's not supported in AI Builder either, apparently, which is usually a good fallback. Though maybe with OCR there it'd be able to "see" the checkmark?
@jl6523
@jl6523 3 ай бұрын
She's good!
@gedelasaisravya2603
@gedelasaisravya2603 5 ай бұрын
Thank you helped a lot. I also need to extract address from the pdf which can be of different number of lines in each file. How do I parse that and write to excel?
@bi-ome
@bi-ome 5 ай бұрын
Depends what the files look like, you could try using an action or regular expression that gets text between two values to get everything in between whatever comes before and after the address - that should get it no matter how many lines there are. Not sure how it needs to be broken out, but regular expressions are great at parsing the street address/zip/state/etc from a text block. You can use if-conditions to run different expressions according to various things (e.g. count the line break characters to get a line count and use that, or if it is a certain country parse it differently etc).
@gioargentati7802
@gioargentati7802 8 ай бұрын
Thank you for posting this informative video. Sometimes, when I try to read text from a PDF using ExtractPDFText, instead of getting readable text, I get strange characters. Do you have any suggestions on how to handle this sort of thing? Thank you in advance, Christine!
@bi-ome
@bi-ome 8 ай бұрын
Are they scanned PDFs? Scanned PDFs can be very weird on the text end. I would try opening one that is having the issue and selecting the text inside the PDF and pasting it into a text editor just to see what it looks like - that'll tell you if the issue is the Power Automate action or the PDF itself.
@kbirstein
@kbirstein 5 ай бұрын
@@bi-ome I had the same issue. I just created a PDF with one line of text and I was able to select it in the PDF and copy it and paste it to Notepad and it appeared as text so I'm thinking this only works on special types of PDFs and it's impossible to know which ones it will work with . . .
@bi-ome
@bi-ome 5 ай бұрын
@@kbirstein Do you have an example of what the weird text looks like? It could be an encoding issue.
@CLTPHDdad
@CLTPHDdad 4 ай бұрын
Hi Christine, Wonderful video! I am wondering whether you could offer any direction on an issue I am having. Basically, I have followed the steps in your video to extract PDF text and have viewed the structure of the text. Resultantly, I have identified a static, unchanging text string represented in all PDFs in the file folder (i.e., the report name) on line 35 and I need the dynamic information (client name, client ID, and date) from the subsequent line (i.e., line 36) to be individual variables that will be incorporated into the Excel file in columns A, B, and C, respectively. The format of the string is: "Full Name | ID Number | MM.DD.YYY" and there is some additional unnecessary text that follows the date. Any direction you might provide would be greatly appreciated!
@bi-ome
@bi-ome 4 ай бұрын
Sure, I use ChatGPT to produce the regex because that saves a lot of time. It provided this to get the full text of line 36: (?:.* ){35}(.*) So if you take that with your parse text action and put it in a variable, then use the action that splits text (search for the word split in the action menu) and split on " | ", you will get each of the values you want in an array. You can reference those by position with %yourVarName[0]% for the first item, %yourVarName[1]% for the second, and so on. You can insert those in columns in Excel like we do in the example in your loop that loops over the docs. 😀
@jomarismeneses3765
@jomarismeneses3765 6 ай бұрын
Good video! How can I do if I have documents with different structures and information and I want to put them in the same Excel document but in different sheets of it? Thank you!
@bi-ome
@bi-ome 6 ай бұрын
You can add either an "if" branch or "cases" that checks the PDF text to see if it contains some particular unique string for each circumstance before extracting the text (you'd use a different extraction expression for each branch). There's actions for "get all Excel worksheets" that you could probably combine with "set active Excel worksheet" to change the worksheet before it gets the first free row to insert the data. You'd get-all outside the loop, then inside each conditional branch have it set the active sheet before it inserts. I'm not sure if the action that swaps sheets gives itself sufficient time to complete or not before continuing, you might need to add a short pause between that and inserting (just something to watch for, I've not tried it yet - might be fine). :)
@KomailButt
@KomailButt 6 ай бұрын
Thank you so much for providing such useful information. I need suggestion. I have document library folder in SharePoint where PDFs comes. In each PDF there is a table which is spreaded over multiple pages. PDFs also include text but I am only interested in extracting those tables from the PDFs. Please suggest the possible easier way to do this task. I am not sure whether I should use AI builder model (online) or this deskstop solution.
@bi-ome
@bi-ome 6 ай бұрын
I feel like AI Builder is probably going to be easier for your case if you have the credits for it. The method in this video is better for extracting very particular information from the text using what amounts to trigger words to find the content. It has the advantage of being "free", but is not as flexible as a LLM would be.
@komailbutt2998
@komailbutt2998 5 ай бұрын
​@bi-ome Thanks. I am looking for a free resource if possible in the first place, but let's see. By the way, how much credit does it require for doing this task, and do you know any other way to do this task for less money?
@bi-ome
@bi-ome 5 ай бұрын
@@komailbutt2998 AI builder credits come with most of the Power Platform license packages, so you might have enough floating around to work with. They’re a first-come-first-serve sort of thing at the tenant level.
@Gaurav19882001
@Gaurav19882001 3 ай бұрын
Thank You So Much for your lovely Support please make some more other tutorials on PDF to Exel mam Please it will be great help for All
@bi-ome
@bi-ome 3 ай бұрын
Is there something in particular you’re trying to do? The thing with Power Automate is it can do SO many things, it helps to know what people are looking for
@craigjenkins6917
@craigjenkins6917 3 ай бұрын
Great video. How would you extract the data from the text above rather than after the line. Thanks
@bi-ome
@bi-ome 3 ай бұрын
I had a conversation with ChatGPT about this, because I'm not a regex expert, it says that regex is designed around finding text after the target and not before. Is there static text before that you can use instead? The wrench is that Power Automate doesn't seem to accept "lookbehinds" in regex like some applications can, so you have to get creative.
@ryannedieu
@ryannedieu 3 ай бұрын
Thank you so much for the tutorial. I know someone has asked about dealing with out of range blank values. Could you elaborate more? I have a fairly simple PDF data extraction with two fields. However, one of them can have a lot of blank values. Thank you.
@bi-ome
@bi-ome 3 ай бұрын
There's lots of ways to do the error handling - you could put the part that selects the value and adds to Excel steps in an error block and set it to continue at the end of the block if it hits an error (which will happen when it tries to do things to a null value). That would basically just have it continue on to the next PDF if it doesn't find a value - not sure if you need to record the blank or not. You could also use the "if" conditional to check the isolated value (after you remove the word you're looking for) or the second position in the array, using a "is not blank" and put the update Excel actions there. Meaning it would only add them to the Excel file if they had a value. You could use "else" to set it to some specific text like "no value" probably too? The error handling takes some trial and error to get working the way you want it to.
@ryannedieu
@ryannedieu 3 ай бұрын
Thank you so much for the response. I’m able to get it work with your suggestions. Thank you!
@taesunyoo4637
@taesunyoo4637 3 ай бұрын
Hi Christine thank you for sharing wonderful video!. Quick question: does this work on Power Automate Cloud as well? sounds kinda silly question.
@bi-ome
@bi-ome 3 ай бұрын
Cloud doesn’t have regex per se, but you can extract from PDF with things like AI Builder to get what you’re after via prompt. The issue I have with it is that when it gets the file content from a PDF, it’s in base64 and has to be converted to be usable by AI… There’s various methods and workarounds for that, and alternatives you can use for more $$ like Syntex. The technique in this video is basically the budget DIY version of all that! 😅
@taesunyoo4637
@taesunyoo4637 3 ай бұрын
@@bi-ome thank you for your prompt reply. ah I was searching around KZfaq as well and like you mentioned I did see solution relying on AI builder to mark/highlight each data field from PDF file. Too bad that regular expression is not supported by cloud which is a bit shocking for me. thank you very much!!
@bi-ome
@bi-ome 3 ай бұрын
@@taesunyoo4637 You can use regex with Syntex/SP Premium! Just not Power Automate. Syntex is a more holistic version of all of this, it's premium priced but they put a UI around it and let you auto-extract the data and apply it to the item in the SP library.
@Steph-ho2gl
@Steph-ho2gl 4 ай бұрын
Super tutorial Christine! is it also possible to collect data from table for which response is below the key word, and not after? thank you!
@bi-ome
@bi-ome 4 ай бұрын
If you just want the one value, you can use your target text and (.+) as the regular expression. That should get everything on the next line. If you are wanting to extract the WHOLE table, that depends on the structure of the PDF - if it's just one table, Power BI can sometimes grab it, but if you have multiple tables or a table that spans multiple pages and isn't always in the exact same location, there is entity extraction with AI in AI Builder or Syntex.
@Steph-ho2gl
@Steph-ho2gl 4 ай бұрын
@@bi-ome thank you for your response Christine ! After further investigation, i discovered that Power Automate desktop has not only a functionality to extract text as you exposed in your video, but also the functionality to extract tables from PDF. Both functionalities are very complementary and very powerful. Thank you again !
@bi-ome
@bi-ome 4 ай бұрын
@@Steph-ho2gl Awesome! Want to link to the resource you used? I don't actually use PAD all that often, this video was based on a technique I picked up while scrambling to do a particular project, so I'm interested!
@Steph-ho2gl
@Steph-ho2gl 4 ай бұрын
Hi Christine, I used the following sequence : 1. Select on file using the action Message box - Select file >> Variable = “Selectedfile” 2. Extract table from PDF file [from SelectdFile into “ExtractedPDFTables” variable] 3. Message box where Power automate will list all the tables he identified in the PDF 4. Launch Excel (which I have prefilled with the header column I want to keep. 5. Insert “For each” to loop all the identified tables [Variable used = “ExtractedPDFTable”] a. Then embed one “if” condition and I specified a specific content on one header column. In fact, my PDF file contains 42 tables, and only one is interesting for my purpose because it contains a specific key word in the CurrentItem.datatable.columnHeadersRow Contains Key word”. i. Obtain the first free line in my excel >> It creates a variable “FirstFreeRowOnColumn” ii. Write in excel [CurrentItem.datatable]when condition is met b. End if 6. End the loop started in 5. Hope it can help you! @@bi-ome
@bi-ome
@bi-ome 4 ай бұрын
@@Steph-ho2gl Thank you so much for taking the time to share!
@user-cl1ir5ui4i
@user-cl1ir5ui4i 3 ай бұрын
Christine, this tutorial gets me most of the way to what I need done. The only problem is that the email I get from my vendor contains a link that I need to follow to get a web based copy of the invoice. To your knowledge is there a way to have Power Automate activate the link so I can scrape the webpage for the information I need and then save the webpage as a PDF with the name that I got from the document? In advance, thanks
@bi-ome
@bi-ome 3 ай бұрын
I think this will be a very difficult one but technically possible... you might want to go with Power Automate Cloud for this instead of desktop, since you'd be triggering it when an email is received. There is a comment in this particular thread that has what looks like a viable solution to me, using the HTTP connector in a cloud flow: ideas.powerautomate.com/d365community/idea/94b667cd-89cc-4c61-8982-79b53c1a6019 Alternately, you can parse email text in the desktop version, maybe do regex to get the URLs ( \bhttps?://[^\s]+ ), then use that to launch a web browser with the URL (from a variable), and save them... Web-based PDFs are a particular pain because the "save" button for a PDF is not something PAD recognizes as a UI element typically, so what I've done as a workaround is use "send keys" to print it to PDF (control p). When you're saving it, you can use send keys again to send a particular text string to the filename field based on a variable (I use send keys with "tab" to tab down to that field... lol... because it can't seem to select it with the recorder). Increase the time between keys slightly or it will get "ahead of itself" while things load.
@user-cl1ir5ui4i
@user-cl1ir5ui4i 3 ай бұрын
@@bi-ome Thank you. If I find a solution I will let you know.
@anushkatiwari1506
@anushkatiwari1506 7 ай бұрын
Thanks for the video, but I have a different request. I want to extract data from all the worksheets of a excel workbook using a loop and then print it in a pdf and save it. I want my user to first select the path of excel workbook and the path where the pdf should be saved. How can I do this?
@bi-ome
@bi-ome 7 ай бұрын
Printing is tricky because often the UI browser often has trouble selecting menu options from the print menu when it runs. What I have done in the past is use "send keys" to do all the navigation in the print/save menu, using tab to navigate through options and "enter" to "click" selected buttons. I increase the delay between keys to significantly above the default, because if they run too quickly together they will trigger before the prior one is finished applying. You can use the action that allows users to enter an input to collect the save path if it will be different each time, then use the path concatenated with the filename in the filename slot in the save menu - that will save the file to whatever path is entered (e.g. C:\Users\username\Documents\FolderName\filename.pdf). Since this isn't usually a selectable input field, I use "send keys" to send the filename/path to that box - the cursor will be there by default when the window opens, if it's not for whatever reason you can send more keys to get there. 😊
@munbingkong5401
@munbingkong5401 2 ай бұрын
Hello this has been a great help, may i know how do you copy and paste the extracted pdf into the notepad with numbers indicating each line for it ? Because when i did that , it only copy word for word and did not include any line numbers..
@bi-ome
@bi-ome 2 ай бұрын
The numbers were just in the PDF file, they're part of the table text. If you want row numbers in the text editor though, you can use a different text editor - e.g. VSCode would have that.
@munbingkong5401
@munbingkong5401 2 ай бұрын
@@bi-ome Hello thank you for replying me ! i have been doing this power automate for my corporate JDE software and out of 10 times maybe 3 times it will log in successfully , by chance , do you know what's the problem?
@JosephHadaCleveland
@JosephHadaCleveland Ай бұрын
Is there any chance you could do a video, if it is possible, on how to feed in a list of URLs from excel and save each page as a pdf to a folder?
@bi-ome
@bi-ome Ай бұрын
The only way I've gotten the web pages to PDF to work is to "print to PDF" via using "send keys" with hotkeys, with tab to tab through menus and enter to "click" print. Lol. It works okay that way but you need to have pauses in there to space out the send keys actions. You'd just loop over the Excel rows and launch a browser with the link as the address and close the browser at the end of the loop.
@tateeslick4695
@tateeslick4695 Ай бұрын
I have my table in the same format but when I view the detracted pdf text, it’s on a new line. Please show or explain what would be different when the heading I want is on a certain line and the text I want is on another line. In the pdf, they are next to each other.
@bi-ome
@bi-ome Ай бұрын
It might be "getting" a line break character that you can't see until it's in Excel. You can try one of the methods here and see if one of these works: www.reddit.com/r/excel/comments/zolxmo/how_do_i_clear_this_empty_line_breaks_for_each/
@narmididi6336
@narmididi6336 6 ай бұрын
Hello. How do i extract from PDF address info which is not on the same line? Thanks
@bi-ome
@bi-ome 6 ай бұрын
Regular expressions are really commonly used to parse addresses, my guess is Chat GPT or the internet has them readily available. I would try to get the address, then split on the line break and comma to separate out the street, city, and zip-
@sebastiansolis5279
@sebastiansolis5279 3 ай бұрын
This is great. Is there any Way to send info to a share list?
@bi-ome
@bi-ome 3 ай бұрын
Yes, there are SharePoint actions in the desktop app, but they will trigger premium licensing. You might consider using the Excel file to feed a PowerShell script, too, that would also work!
@jomelreyes9133
@jomelreyes9133 4 ай бұрын
Hi can you help, I cant proceed to Value to write field on Write to Excel worksheet action my flow variable is Invoice. In the latest power automate the default is =Invoice not %Invoice% and when I attemping to put [1] or =Invoice[1] it gives me an error "unexpected Chatacter"
@bi-ome
@bi-ome 4 ай бұрын
Do you have this feature turned on? "Power Fx enabled (Preview)." I heard from another commenter that this setting was on and it was changing the structure of all of the variable references. You might toggle it off and see if that does it. This video was created before this feature was added.
@jomelreyes9133
@jomelreyes9133 4 ай бұрын
​@@bi-omeIt worked, because I couldn't figure out how to disable Power FX, what I did was I just created a new flow, and yes, it worked. Thank you for your help.
@starkybaxter4867
@starkybaxter4867 2 ай бұрын
Hi, Is this also works on Leads data pdf sheet? I want them to extract to csv file
@bi-ome
@bi-ome 2 ай бұрын
I'm not sure what a Leads data pdf sheet is, but it will work on any selectable text data in PDFs. If you want it to go to CSV, you can save the Excel file as CSV type at the end of the flow. There's also an action to go directly to CSV from a data table variable, skipping Excel, but we're not using data table variables in this particular tutorial - either will work
@theretheheheheee
@theretheheheheee 4 ай бұрын
I have like a hundred pdf personal data, like ID card, which state name, dob, etc. Can I use this to transform it to excel rather than to typing it one by one ?
@bi-ome
@bi-ome 4 ай бұрын
Yes, as long as they follow a similar structure (e.g. state name is always after "State:") and the text is selectable in the PDF (not scanned). If you're working with scanned ID cards, you need to use OCR on the text first, which you could do with a call to an API service.
@piratefish5856
@piratefish5856 5 ай бұрын
As a new user doing my best to learn power automate, i thank you, your video has been greatly helpful. Your tutorial works great for me except for one detail. At the 12:30 mark in your video, where you are pulling the second position in the array and enter the [1], i cannot get this to work. None of my variables have % signs around them. i have been searching for how to turn on the % signs, or how to make it work without the % signs. How do i make this work?
@bi-ome
@bi-ome 5 ай бұрын
The % signs are how you reference variables - so you can type %your-variable-name% to use them in formulas. It's just how Power Automate knows you're making a reference vs just normal text - so you can type whatever you want in the box. They won't have percentages in the dropdown menu. 🙂
@piratefish5856
@piratefish5856 5 ай бұрын
@@bi-ome That was one of the first things i tried. It gives me an error that reads "Parameter 'Value to write': Unexpected characters. Characters are used in the formula in an unexpected way." I don't know if i am on a newer version or if there is a setting for the % signs. In your video, all your variables have % signs around them. In my use of power automate, nothing has a % sign around it.
@bi-ome
@bi-ome 5 ай бұрын
@@piratefish5856 You are using PA *desktop*, right? Like from your start menu? Power Automate Cloud is very different and they are often confused--
@piratefish5856
@piratefish5856 5 ай бұрын
@@bi-ome I am using the PA desktop app. Version 2.40.157.24023, which is the most recent version as of 01/29/2024. To try to be more specific, on your Write to Excel worksheet popup, in the Excel Instance block, on your video is shows "%ExcelInstance%" but for me i see "=ExcelInstance"; same thing for the Value to write block, your video shows "%InstitutionName%" by default, my screen shows "=InstitutionName"
@bi-ome
@bi-ome 5 ай бұрын
Hm your comments somehow hit the moderation filter, and when I approved them they disappeared... but you mentioned something about the variables not having % in them? I would go to the step that they are created in, e.g. the step that opens the Excel file, and check that they have percentages there. When I name a variable in a step, it auto-adds the percentage signs, but perhaps yours don't for whatever reason? Also they only show up when you're editing the step and click on the var name in the editor, if that's part of it.
@cbacca2999
@cbacca2999 Ай бұрын
What is the link to get the PDF file so I can follow along?
@bi-ome
@bi-ome Ай бұрын
Sure, I went back and checked and its not super straightforward to download them from the FFIEC site directly, you have to put in the institution ID to get each file (it's been a while), but I found what looks like the files I used on my PC and uploaded them here: christine-payton.com/resources-files/
@davidsides5398
@davidsides5398 6 ай бұрын
I get an error that says Index '1' is out of range, when it tries to write the second line item. Any idea?
@bi-ome
@bi-ome 6 ай бұрын
That usually means it’s blank, that there’s nothing in it. I would check to see what ends up in your variable when it runs, your expression may need adjustment-
@partththombre7446
@partththombre7446 Ай бұрын
​@@bi-ome I have got the same error , there is a [] sign besides my variable name please can you help me with the same??
@user-ws4ih1ev1q
@user-ws4ih1ev1q 4 ай бұрын
You just extracted the values once what should I do to run a loop and extract all the values from my data? Is there a way to do so?
@bi-ome
@bi-ome 4 ай бұрын
It's running in a loop on the files, so it'll extract each thing you configure it to once per document. and create a row in Excel for each. If your files are structured differently, you can provide more info about what you're trying to do and I can try to offer a suggestion. :)
@user-ws4ih1ev1q
@user-ws4ih1ev1q 4 ай бұрын
@@bi-ome See in my pdf I just dont have one id or one name I have multiple and I want to extract all of them one by one in excel so is there a way to do it? Like run a loop until all the values are extracted and then stop?
@bi-ome
@bi-ome 4 ай бұрын
@@user-ws4ih1ev1q Yeah, there is a toggle in the "parse text" action where you can toggle whether it gets the first occurrence only or all. So if the names always occur after the text "Name: " you'd use Name: (.+) for your regex, toggle off "first occurrence only" and it should just put them all in an array. You can reference those based on array position, or for-each loop over the items in the array to do things with them. That said, if your PDFs are complex, it might be a lot easier to use AI tools like Syntex (but more expensive).
@user-ws4ih1ev1q
@user-ws4ih1ev1q 4 ай бұрын
@@bi-ome Actually I used a python script to extract the data, the normal Parse Text was not working, so I dont know how to reach to the solution of my problem.
@stanTrX
@stanTrX 9 күн бұрын
Thanks. But it looks like too much time to set this up, so it should be something worth doing for repeated tasks maybe. Another thing is pdf file shall be in the same format. Can you also extract unstructured pdf s ?
@bi-ome
@bi-ome 9 күн бұрын
Yes? I would not generally do this for one handful of PDFs, it's just an example. 🙂 This will work on any PDFs that have selectable text in them, because the expression is just getting text next to or above or below other text, but not scanned PDFs.
@user-di3rl6ww2d
@user-di3rl6ww2d 4 ай бұрын
Hi what if the pdf info is all image and unable extract the data
@bi-ome
@bi-ome 4 ай бұрын
You would need to use OCR on it in the flow to get the text. There’s various tools to do this, easiest to use is probably AI Builder in Power Automate Cloud-
@cheerbear9059
@cheerbear9059 3 күн бұрын
Is there a way to do this with google sheets
@bi-ome
@bi-ome 3 күн бұрын
If you sync it with Google drive to your local computer, and then open it with Excel, maybe? The thing is most of the actions for spreadsheets are Excel-specific, so it really needs to modify the file in Excel unless you create a CSV or something instead. A CSV would probably work, but you'd be going about it very differently.
@AsaelCosentino
@AsaelCosentino 2 ай бұрын
Getting the problem all the time: "file is open in another application". Someone knows how to fix?
@bi-ome
@bi-ome 2 ай бұрын
You can only have the Excel file open in one instance. So if you are testing your flow, you need to remember to close the file between runs so that it’s not already open when the flow tries to open it :)
@coolandtrending4497
@coolandtrending4497 18 күн бұрын
Does this not work in power automate cloud version with files in SharePoint folder. This was awesome video but I don't use PA desktop version so asking 😅
@bi-ome
@bi-ome 18 күн бұрын
Good question! Power Automate Cloud does not have regular expressions and pulls the file content via API, meaning it comes in as base64 (garbled random letters). In order to do anything with it, you have to either translate it out of base64 (which requires a third-party connector that is not free) or use OCR on it as if it were an image to "recognize" the text and concatenate it all together. Then you'd use another AI connector on it, typically AI Builder or Azure Open AI, to parse the text and pull out the parts you want to put in Excel. Or use SharePoint Premium, which basically packages the above into a solution for you but is more expensive. So, long story short, since Power Automate Desktop is installed on most Windows machines and is free to use locally, it's often much less of a pita and less expensive to just use that vs cloud for this particular job. Power Automate Desktop essentially is just a robot that controls your computer where you can tell it what actions to take, so it bypasses the base 64 issue and allows regular expressions so you can extract without AI. But it's technically possible to get the same output with either.
@coolandtrending4497
@coolandtrending4497 18 күн бұрын
@@bi-ome you are right. Currently trying AI builder method. Again, thanks for the informative content.
@Eli22clem
@Eli22clem 4 ай бұрын
Didn't work (edit) After mapping the shared folder my pdfs were on to a drive it did work! Now just have to get it to pull all instances in each pdf and not just the first.
@bi-ome
@bi-ome 4 ай бұрын
Yes, quite literally nothing you will ever make in Power Automate desktop will work the first time you run it - it requires troubleshooting to get things working. That's part of the process. :)
@KutzKraftsNKreationz
@KutzKraftsNKreationz 2 ай бұрын
I need it for my business but i zoned out after 3 mins. has nothing to do with the video. I am just not tech savvy. how can I get a one on one to see if i can use it for my business?
@bi-ome
@bi-ome 2 ай бұрын
If you have any particular questions you can ask here and I'll try to answer. Any automation tool will require a technical person, or at least someone who can make it through a video or two lol. I would recommend this tool for people who use Microsoft 365, thus are already invested in Microsoft stack products. I used it here because it's what was available to me at the time.
@shivarajaprasadambati9632
@shivarajaprasadambati9632 8 ай бұрын
I have pdf where data having in other languages
@bi-ome
@bi-ome 8 ай бұрын
You can look for other language words with regular expressions too!
@MCAES01
@MCAES01 4 ай бұрын
you record your extra large screen and when we watch it on small screens or smartphones the words become microscopic get a dislike
@123456789santia
@123456789santia 7 ай бұрын
Do you work for google? cause your channel is full of knowledge that takes too much time to learn.
@bi-ome
@bi-ome 7 ай бұрын
Lol, no, thanks though - I have been using most of these tools a really long time. PA Desktop is pretty new for me, but I picked up this particular trick out of desperation trying to complete a particular project while doing a lot of mental flailing around trying to learn the tool. Thought it might save someone else some stress to share. 🤣
@lizzyvallejo8298
@lizzyvallejo8298 5 ай бұрын
Necesito extraer el Saldo Actual de un extracto bancario el dato se muestra así: Saldo Actual $ 1.234.567.890.23 y cuando hago el paso de analizar texto no me trae información, sabes que debo cambiar?
@bi-ome
@bi-ome 5 ай бұрын
Puede ser complicado cuando tienes que lidiar con comas y signos de dólar, porque el PDF es texto y no necesariamente reconoce que esos son parte del número. Yo intentaría algo así (usé GPT para obtener esta fórmula, es muy bueno para hacer expresiones regulares). Current Balance \$\s*(\d{1,3}(,\d{3})*) Explicación: Current Balance \$: Coincide con el texto literal "Current Balance $". El signo de dólar se escapa porque $ es un carácter especial en regex. \s*: Coincide con cero o más caracteres de espacio en blanco. (\d{1,3}(,\d{3})*): Captura el número. \d{1,3}: Coincide con 1 a 3 dígitos (para el primer grupo del número). (,\d{3})*: Coincide con cero o más grupos de una coma seguida exactamente por tres dígitos.
@aycampos
@aycampos 4 ай бұрын
@@bi-ome thank you so much for putting this video together, I have watched 3-4 times already and I almost finish my flow, the only thing is the same issue or similar as above, but my amounts do not have a dollar sign and are not next to the word but down below in a different row or underneath, and I have up to millions for the amount, could you please help me with the expression? Thank you
@bi-ome
@bi-ome 4 ай бұрын
​@@aycampos Yeah, the expression TargetText: (.+) should get whatever is on the line below the target text (replace target text with the key word you're looking for). I'd run it with that and see if you get something you can split parts off of. It depends on if you have other text in that second line or not. For all of these I am just using ChatGPT to help me write regular expressions, just give it an example of what your content looks like and ask it to write "regex without look-aheads or look-behinds". It's very good at regular expressions. 🙂
@aycampos
@aycampos 4 ай бұрын
@@bi-ome oh wow, thank you for getting back to me so quick, I'll try both the expression and ChatGPT.
PDF to Excel Converter
22:34
Kevin Stratvert
Рет қаралды 223 М.
How to Extract Data from PDF with Power Automate
29:30
Anders Jensen
Рет қаралды 208 М.
HOW DID HE WIN? 😱
00:33
Topper Guild
Рет қаралды 45 МЛН
Can You Draw A PERFECTLY Dotted Circle?
00:55
Stokes Twins
Рет қаралды 45 МЛН
One moment can change your life ✨🔄
00:32
A4
Рет қаралды 19 МЛН
Cat Corn?! 🙀 #cat #cute #catlover
00:54
Stocat
Рет қаралды 15 МЛН
Power Automate Desktop: Read Excel and Web Search - Beginners Tutorial
30:13
Extract Specific Data from PDF to Excel
4:30
Wondershare PDFelement
Рет қаралды 22 М.
Extract Data from PDF using AI Model with Power Automate
11:13
Properly Convert PDF to Excel
11:28
Leila Gharani
Рет қаралды 1 МЛН
Web Scraping Made EASY With Power Automate Desktop - For FREE & ZERO Coding
13:11
High-Income Excel Skills Worth Learning in 2024 (Free File)
29:19
MyOnlineTrainingHub
Рет қаралды 334 М.
GPT PDF & Image Data Extraction (Power Automate)
22:15
Tyler Kolota
Рет қаралды 13 М.
Power Automate Desktop: PDF Extraction and Application Entry
1:13:43
Anders Jensen
Рет қаралды 38 М.
⚡️Супер БЫСТРАЯ Зарядка | Проверка
1:00
WATERPROOF RATED IP-69🌧️#oppo #oppof27pro#oppoindia
0:10
Fivestar Mobile
Рет қаралды 19 МЛН
Что не так с раскладушками? #samsung #fold
0:42