Split Excel Data into Multiple Files by Column Values using Python (fast & easy)

  Рет қаралды 23,680

Coding Is Fun

Coding Is Fun

3 жыл бұрын

👉 Explore All My Excel Solutions: pythonandvba.com/solutions
𝗗𝗘𝗦𝗖𝗥𝗜𝗣𝗧𝗜𝗢𝗡
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
This tutorial will show you how to separate Excel Data into Workbooks by Column Values. Instead of VBA, we will be using Python and the Pandas Library. The Python code from this tutorial can be a huge time saver. You can apply the code from this tutorial also any other Excel - you just need to tweak it slightly.
Download all the files & source code from the tutorial:
👉 github.com/Sven-Bo/split-work...
𝗧𝗢𝗢𝗟𝗦 𝗔𝗡𝗗 𝗥𝗘𝗦𝗢𝗨𝗥𝗖𝗘𝗦
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🆓【𝗙𝗥𝗘𝗘】Excel Add-in (𝗠𝘆𝗧𝗼𝗼𝗹𝗕𝗲𝗹𝘁): pythonandvba.com/mytoolbelt
📊 Dashboard Excel Add-In (𝗚𝗿𝗮𝗳𝗹𝘆): pythonandvba.com/grafly
🎨 Cartoon Charts Excel Add-In (𝗖𝘂𝘁𝗲𝗣𝗹𝗼𝘁𝘀): pythonandvba.com/cuteplots
🤪 Fun Emoji Excel Add-In (𝗘𝗺𝗼𝗷𝗶𝗳𝘆): pythonandvba.com/emojify
📑 Excel Templates: pythonandvba.com/go/excel-tem...
🎓 My Courses: pythonandvba.com/go/courses
📚 Books, Tools, and More: pythonandvba.com/resources
𝗖𝗢𝗡𝗡𝗘𝗖𝗧 𝗪𝗜𝗧𝗛 𝗠𝗘
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 LinkedIn: / sven-bosau
📸 Instagram: / codingisfun_official
💻 GitHub: github.com/Sven-Bo
💬 Discord: pythonandvba.com/discord
📬 Contact: pythonandvba.com/contact
☕ 𝗕𝘂𝘆 𝗺𝗲 𝗮 𝗰𝗼𝗳𝗳𝗲𝗲?
If you want to support this channel, you can buy me a coffee here: pythonandvba.com/coffee-donation

Пікірлер: 164
@CodingIsFun
@CodingIsFun 3 жыл бұрын
*What kind of Excel Automation would you like to see next?* Let me know in the comments 😃
@cendrabinarto
@cendrabinarto 2 жыл бұрын
how about splitting into separate excel files based on multiple columns? e.g. extract a file when data in excel column A (go to zoo?) is "yes", extract a file when data in excel column B (go to museum?) is "yes" and so on. So we will have separate excel files "go to zoo", "go to museum" and so on. Thank you!
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@cendrabinarto Thanks for your video suggestion!
@faisalmehmood6889
@faisalmehmood6889 2 жыл бұрын
Hi, I did saw all your videos and be honest help me a lot. One of them was about sending text / WhatsApp messages from excel using API’s. Here comes suggestion; can we send a sheet (as personal attachment) to different email address at once using excel & python. Each person will get his attachment base on email address and sheet location.
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@faisalmehmood6889 Thank you very much for the suggestion!
@macxcan
@macxcan Жыл бұрын
How to average the Gross sale column by country or by segment or by product and return result on the same page but on different location?
@daveg4327
@daveg4327 7 ай бұрын
Thank you so much. You helped me automate 3 hours of weekly work for an FTE. Great video
@CodingIsFun
@CodingIsFun 7 ай бұрын
Glad I could help! Thanks for watching and your comment! :)
@marco_rock_60s
@marco_rock_60s 6 ай бұрын
That was perfect, exactly what I was looking for, thanks!! Concise, yet complete. :)
@CodingIsFun
@CodingIsFun 6 ай бұрын
Glad you liked it. Thanks for watching and taking the time to leave a comment! 👍
@torque6389
@torque6389 6 күн бұрын
This video is gold! Thank you!
@CodingIsFun
@CodingIsFun 5 күн бұрын
Happy to hear that it was useful; thank you for taking the time to leave a comment and for watching the video! Cheers, Sven ✌️
@mayday_excelgroup
@mayday_excelgroup 10 ай бұрын
Best thing I’ve ever seen so far about excel+python
@CodingIsFun
@CodingIsFun 10 ай бұрын
Glad to hear you liked it! Thank you for commenting and watching.
@bak1970
@bak1970 3 жыл бұрын
Hello , coding is fun. Sorry I don't know your good name . You are welcome and thanks for your comments below. I actually love python coding. I came to learn it as I mentioned before for more than 1 year now . With your videos you are showing how much we can do with python. I prefer using python instead of VB since it is much easier and script oriented and it is easy for me to learn. Thank you again.
@CodingIsFun
@CodingIsFun 3 жыл бұрын
I am super happy to hear that my videos are helping you on your Python journey :) Cheers, Sven :)
@bak1970
@bak1970 3 жыл бұрын
Nice coding. I just need to look into it more deeply to make sure I understand it well. Thanks
@CodingIsFun
@CodingIsFun 3 жыл бұрын
*Hello Bassel Khaled, thanks for watching the video and your comment.*
@eryabolonha
@eryabolonha 2 жыл бұрын
New subscriber Thank you ✨ This would definitely save me a lot of time I followed your tutorial which made me super happy 😊 but I need the master file format and data validation sheet 🙃
@CodingIsFun
@CodingIsFun 2 жыл бұрын
*Welcome aboard!* Thank you for watching the video & for taking the time to leave a comment! 😃
@lopher70
@lopher70 3 жыл бұрын
I loved! Thanks
@CodingIsFun
@CodingIsFun 3 жыл бұрын
*Hi Ricardo López-Herrera. I am glad you liked the video; thanks for watching and for the comment!*
@sureshgoku
@sureshgoku 3 жыл бұрын
Hey great video enjoyed it thanks alot.
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thank you, glad it is helpful! 😃
@marciorodriguesmota7927
@marciorodriguesmota7927 Жыл бұрын
o video que eu estava procurando h'a muito tempo, thankss
@CodingIsFun
@CodingIsFun Жыл бұрын
Happy to hear that it was useful; thank you for taking the time to leave a comment and for watching the video!
@carlosvalbuena6017
@carlosvalbuena6017 Жыл бұрын
Great video!
@CodingIsFun
@CodingIsFun Жыл бұрын
Glad you liked it. Thanks for watching and taking the time to leave a comment!
@ridohconcepts8868
@ridohconcepts8868 3 жыл бұрын
Awesome! tutorial!
@CodingIsFun
@CodingIsFun 3 жыл бұрын
*I am glad you liked the video, thanks for watching and for the comment!*
@MrPaddy35
@MrPaddy35 2 жыл бұрын
much needed :))
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Glad it was useful! Thank you for watching the video & your comment! :)
@hassanjatta4257
@hassanjatta4257 3 жыл бұрын
Awesome!!!
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thank you for watching the video & your comment! :)
@ranjanagajbhiye9280
@ranjanagajbhiye9280 Жыл бұрын
Great video.
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks!
@user-qp8ei8lc3d
@user-qp8ei8lc3d 3 жыл бұрын
Nice! Can you bring a video explaining the difference between iterating in a df with itertuples (), iterrows () and .to_dict ()?
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thank you for watching the video & your video suggestion. Both methods .itertuples() and .iterrows() are used to loop over a dataframe. Instead of using "for i in range(len(df))", you should/could use .itertuples() or .iterrows(). The difference is that Pandas’ iterrows() returns the index of each row and the data in each row as a *Series*. itertuples() on the other side, loops through rows of a dataframe and return a *named tuple*. itertuples is generally faster than iterrows() Here you can also find a great example of how to use both methods: cmdlinetips.com/2018/12/how-to-loop-through-pandas-rows-or-how-to-iterate-over-pandas-rows/ *.to_dict()* is used to convert the DataFrame to a dictionary. Example below: col1 col2 row1 1 0.50 row2 2 0.75 df.to_dict() {'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}} Hope this helps 🙏
@mohammedsamad2214
@mohammedsamad2214 Жыл бұрын
thank you very much
@CodingIsFun
@CodingIsFun Жыл бұрын
My pleasure! Appreciate you taking the time to watch and leave a comment. 👍
@nortonofnorthamerica
@nortonofnorthamerica 3 жыл бұрын
Great content
@CodingIsFun
@CodingIsFun 3 жыл бұрын
*Hi Sean Faherty. I am glad you liked the video; thanks for watching and for the comment!*
@Pavel64reg
@Pavel64reg Жыл бұрын
Большое спасибо🙏
@CodingIsFun
@CodingIsFun Жыл бұрын
You're very welcome! Thank you for watching and for the comment! ✌
@austenlowitz8872
@austenlowitz8872 Жыл бұрын
Great video! Thanks so much. Do you know how we can split the files while maintaining the excel formulas in the master file?
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching. That is be possible with openpyxl or xlwings. Happy Coding! :)
@kevinthomas4911
@kevinthomas4911 Жыл бұрын
Great videos! Thank you so much! Quick question: What if the column values I am splitting by are numeric instead of strings? I keep getting an error with 'str.contains', but when I use a different column that are string values everything works fine.
@kevinthomas4911
@kevinthomas4911 Жыл бұрын
ah, I was able to import column as object and that fixed it.
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching. You could convert the respective column first to strings: stackoverflow.com/a/22006514 I hope it helps! Happy Coding!
@MsIzreen
@MsIzreen 2 жыл бұрын
Hi! Thanks for the video as it help me a lot. Can I know how to save the final multiple file into csv?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching. Instead of 'df.to_excel()'. use ''df.to_csv()''. I hope it helps! Happy Coding!
@Murthyy24
@Murthyy24 2 жыл бұрын
Hi, thanks for the great video. What if the master excel had multiple sheets and how do we split it into multiple excels ? Also all sheets may not have data for all countries.
@CodingIsFun
@CodingIsFun 2 жыл бұрын
I am not entirely sure if I understand your use case correctly or what you are trying to achieve. Yet, based on the information given, you might want to look into xlwings to help you with the automation. E.g. here is a video on separating each sheet as a separate file (kzfaq.info/get/bejne/qc5oo6ijuJiVonk.html). Once you have the different files, you could combine the code from this video to split your table further (based on the country) column. I hope this helps!
@hadi11911
@hadi11911 2 жыл бұрын
Hi, thank you for the video, can you also share a video on the same method but in the case where we have multiple sheets in a file that are also interlinked with formulas like for example an input file "X" has 2 sheets "1" & "2" and both have countries A, B & C so now how do we split the file "X" so that it is split into file "A" that has sheet 1 & 2 in it with formulas and data for country A only. similarly files B and C are created with both sheets and formulas with relevant data. Thank you
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video & your great suggestion! Unfortunately, I cannot make any promises to create the requested solution.
@ahmedhelal920
@ahmedhelal920 2 жыл бұрын
thanks for this useful information , please how can i make filter with specific unique value ? specific unique value on other column with country column?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video. Please find below an example of how to filter multiple columns. I hope this helps! _______ from pathlib import Path import pandas as pd EXCEL_FILE_PATH = Path.cwd() / "YOUR_EXCEL_WORKBOOK.xlsx" # Load financial data into dataframe data = pd.read_excel(EXCEL_FILE_PATH, sheet_name="Data") # Get unique values from any particular column column_name = "YOUR_COLUMN_NAME" unique_values = data[column_name].unique() # Query/Filter the dataframe and export the filtered dataframe as an Excel file for unique_value in unique_values: data_output = data.query(f"{column_name} == @unique_value & Year==2021") output_path = Path.cwd() / f"{unique_value}.xlsx" data_output.to_excel(output_path, sheet_name=unique_value, index=False)
@veginetikalyanprasadchowda7041
@veginetikalyanprasadchowda7041 Жыл бұрын
I need a solution in one column i have a lot of data and blank cells what requirements was all values from that sheet need to update in one file and blank value records to the 2nd CSV file is that possible
@CodingIsFun
@CodingIsFun Жыл бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
@beboldermedia
@beboldermedia 2 жыл бұрын
Not sure if you will see this but wondering what to do when I see this error: InvalidWorksheetName: Invalid Excel character '[]:*?/\' When looking at my datafile I noticed some Customer_Name's have "/" in them. Any help?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video. Have you watched it until the end? In the last example, we took care of the special characters: kzfaq.info/get/bejne/hL10Za-EuNvdmnk.html Code: github.com/Sven-Bo/split-workbooks-python/blob/master/3rd_Example/Split_Workbook.py I hope this helps!
@xHeroGirlx2
@xHeroGirlx2 8 ай бұрын
Nice! Thank you! How can I split one excel file with multiple sheets into multiple files with multiple sheets based on one row/value in each sheets?
@CodingIsFun
@CodingIsFun 8 ай бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
@yanhaolin3696
@yanhaolin3696 Жыл бұрын
I am running into errors with Unique_value where it keeps saying “first arguement must be string or compli pattern.” Also when I do + ‘.xlsx’ it says something like cannot combine float and string. But the code works and I can see the files created… any explanation on that?
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching and for your question. Hard to tell from a distance why you are facing an error. Sorry that I cannot help.
@jhonnidarshan3094
@jhonnidarshan3094 2 жыл бұрын
How do we get get new excel sheets by Columns that’s mean if it has 15columns and I need only 3 specific columns
@CodingIsFun
@CodingIsFun 2 жыл бұрын
You could do it like this: NOTE: YOU NEED TO ADJUST THE CODE SNIPPET BELOW (change the placeholders) import pandas as pd # pip install pandas import os df = pd.read_excel('NAME_OF_WORKBOOK.xlsx') column_name = 'NAME_OF COLUMN' unique_values = df[column_name].unique() for unique_value in unique_values: df_output = df[df[column_name].str.contains(unique_value)] # Select the ones you want df_output = df_output[['column1','column2', 'column3']] output_path = os.path.join(os.getcwd(), str(unique_value) + '.xlsx') df_output.to_excel(output_path, sheet_name=unique_value[:31], index=False)
@donteo3105
@donteo3105 3 жыл бұрын
great sharing thanks! just to check if want to split by country And by product? thanks
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thanks for watching the video. An easy way would be to convert the existing code into a function. See an example below: import pandas as pd # pip install pandas import os df = pd.read_excel('Financial_Sample.xlsx') def split_by_columns(column_name): unique_values = df[column_name].unique() for unique_value in unique_values: df_output = df[df[column_name].str.contains(unique_value)] output_path = os.path.join('output', str(unique_value) + '.xlsx') df_output.to_excel(output_path, sheet_name=unique_value[:31], index=False) split_by_columns('Segment') split_by_columns('Product')
@donteo3105
@donteo3105 3 жыл бұрын
@@CodingIsFun you’re awesome thanks alot 👍👍👍
@lupinarsene1657
@lupinarsene1657 Жыл бұрын
Sir, your amazing. Can you please tell me how do I think like the way you do.... using python? What should I do in order to learn or imagine or think the way you do sir? It will be soo useful for my career. Thanks for all the videos
@CodingIsFun
@CodingIsFun Жыл бұрын
Hey there! Thank you so much for your kind words. Regarding your question: It basically just boils down to practice. The more you code, the more you'll develop a deeper understanding and intuition for it. Interestingly, creating these KZfaq videos has helped me a lot in getting better at coding as well. Keep working on projects, experimenting, and learning from your mistakes. Good luck on your journey! 🚀
@lupinarsene1657
@lupinarsene1657 Жыл бұрын
@@CodingIsFun Thanks for replying sir. I will start practicing. Thank you!
@mikebeaton5675
@mikebeaton5675 Жыл бұрын
Hey - I took your code and just changed the Column Name to work with my use case. After going through a few thousand seperations, I'm getting this error -- TypeError: first argument must be string or compiled pattern -- any thoughts?
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching. Are you using this code? github.com/Sven-Bo/split-workbooks-python/blob/master/3rd_Example/Split_Workbook.py And which line is causing the error?
@varatharajinpayanangal
@varatharajinpayanangal 2 жыл бұрын
How do I format the output excel sheet? Like column width, auto adjust size of column to content
@CodingIsFun
@CodingIsFun 2 жыл бұрын
You can use the XlsxWriter package: xlsxwriter.readthedocs.io/working_with_pandas.html
@ahmedhelal920
@ahmedhelal920 2 жыл бұрын
wooow very nice , how can we add specific excel file from these to company pdf and print it automatic
@vladimir7759
@vladimir7759 3 жыл бұрын
hi friend, can you make a video or tell me which library to use to split the text in a cell into columns? thanks!!!
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thanks for watching the video. You could use the pandas library for this. Example below: df['First Name'] = df['Name'].str.split(',', expand=True)[1] df['Last Name'] = df['Name'].str.split(',', expand=True)[0] Here are some more examples: www.geeksforgeeks.org/python-pandas-split-strings-into-two-list-columns-using-str-split/ Hope this helps!
@sudharshan574
@sudharshan574 3 жыл бұрын
Great tutorial, how to split file by column values, if one cell has 20 letters. because it throughs me error
@Viralvlogvideos
@Viralvlogvideos 7 ай бұрын
Pandas is failing to process 100 million records around 18.5 gb csv file. Any idea how to split into smaller records, I also tried polars it is generated only one file of 100k
@CodingIsFun
@CodingIsFun 7 ай бұрын
Thank you for watching! Unfortunately, the question you asked is not related to the video content. If you have any questions or need help, I suggest you join my Discord channel at pythonandvba.com/discord . Just make sure to read the forum rules before posting any questions.
@A07Raiblo
@A07Raiblo 2 жыл бұрын
how can i take just 1 country and save it in a neu file?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Like so: pastebin.com/D3rQHQMz Happy Automating! :)
@bhargavit1108
@bhargavit1108 2 жыл бұрын
Hii I'm getting issue like invalidworksheetName: excel worksheet name should must be
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Please use the updated code from GitHub: github.com/Sven-Bo/split-workbooks-python/blob/master/1st_Example/Split_Workbook.py What I have added to prevent this error: www.screencast.com/t/stjRN53vC
@verma_ayush
@verma_ayush 2 жыл бұрын
if want this data in a template can this be done?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
You could do that with 'openpyxl' or 'xlwings'
@arnabmaity8123
@arnabmaity8123 2 жыл бұрын
hi sir it was nice ,but i have some another issue ...i tried converting large .csv file into .excel file its generating but it is deleting the space between column detas ......so how do i solve this?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching. Hard to tell from a distance as I do not know your code.
@dcdien07
@dcdien07 2 жыл бұрын
Thanks for the video, but i have a problem on the 12th line. The result is "Output\\D200.xlsx", which is not correct for window. Any suggestions on how to fix it please
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video. What exactly is the error? Is the file path incorrect?
@dcdien07
@dcdien07 2 жыл бұрын
@@CodingIsFun [Errno 22] Invalid argument: 'output\\
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@dcdien07 Have you watched the video until the end? In the last example, we took care of the special characters: kzfaq.info/get/bejne/hL10Za-EuNvdmnk.html Code: github.com/Sven-Bo/split-workbooks-python/blob/master/3rd_Example/Split_Workbook.py I hope this helps!
@Chrisfagan8881
@Chrisfagan8881 2 жыл бұрын
Found an issue using this line in the code recently. df_output=df[df[column_name].str.contains(unique_value)] Using .str.contains if we have Guinea & Guinea-Bissau in the columns we want to split by. Guinea-Bissau will be added to the Guinea sheet. Or if we have romania and oman same issue. Way around it was to change to this df_output = df[df[column_name].str.fullmatch(unique_value)]
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Great catch! I have committed changes to the repo. Many thanks for your solution - I appreciate it! :)
@sureshgoku
@sureshgoku 3 жыл бұрын
Just a question can we list validation in python like it's done in excel
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Do you mean like inserting 'DropDown' Menus in Excel?
@sureshgoku
@sureshgoku 3 жыл бұрын
@@CodingIsFun yes the insertion of drop down in specific cells in excel
@jefrin
@jefrin 2 жыл бұрын
Wow....nice. That's what I needed. I want only a specific set of countries in the output folder. Eg. Out of countries like Japan, India, Germany, USA.I want only japan and USA in the output folder.
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching, and glad to hear that you find the solution useful. Regarding your question, instead of using all unique values, you could do the selection as follows: import pandas as pd # pip install pandas import os df = pd.read_excel('NAME_OF_WORKBOOK.xlsx') column_name = 'NAME_OF COLUMN' unique_values = ['Japan', 'USA'] for unique_value in unique_values: df_output = df[df[column_name].str.contains(unique_value)] output_path = os.path.join(os.getcwd(), str(unique_value) + '.xlsx') df_output.to_excel(output_path, sheet_name=unique_value[:31], index=False) I hope this helps!
@jefrin
@jefrin 2 жыл бұрын
@@CodingIsFun Thanks! That worked....
@jefrin
@jefrin 2 жыл бұрын
@@CodingIsFunIn your example, the table begins from row 1. In my file, the table begins from row 5. This code by default detects row 1 as the column header. I will have to delete first few rows to get things done. Are there any solutions for it? Thank you once again...
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@jefrin, You can specify more arguments in pd.read_excel, like how many rows pandas should skip. See also the official pandas documentation: pandas.pydata.org/docs/reference/api/pandas.read_excel.html Here is also a video where I am going through the different pd.read_excel options: kzfaq.info/get/bejne/ecySY6SJqNu0oas.html To make your life easier, you could also use my Excel Add-in, which will fill out the pd.read_excel arguments for you: kzfaq.info/get/bejne/ptOjaJWQksnRk2w.html
@jefrin
@jefrin 2 жыл бұрын
@@CodingIsFun Thank you very much. Love from India 🇮🇳
@dominospizzach
@dominospizzach 2 жыл бұрын
thanks for this tutorial and all your other videos! I am having issues with the 4th line of the code where I entered the full path of my excel file. I get the following error. Any suggestions on how to fix it please? Can't figure out what I am doing wrong. df = pd.read_excel('C:\Users\blablabla\OneDrive\Desktop\Python\Work_test.xlsx') ^ SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video & your question. Could you try to provide the string with a leading 'r', see example below: df = pd.read_excel(r'C:\Users\blablabla\OneDrive\Desktop\Python\Work_test.xlsx') I hope this helps!
@dominospizzach
@dominospizzach 2 жыл бұрын
@@CodingIsFun thanks for replying I’ll test this later. I eventually replaced back slash (\) with forward slash (/) and it worked but unsure why.
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@dominospizzach probably because you are on a MacOS?! :)
@Kata124you
@Kata124you Жыл бұрын
Hello, does anyone know how to split that excel file into one PDF? For example I do not want multiple excel files I want only one PDF with 5 pages (reports) for each country
@CodingIsFun
@CodingIsFun Жыл бұрын
Perhaps the following answer might help: stackoverflow.com/a/59574470
@turek871
@turek871 3 жыл бұрын
Quick question, how can I keep the formatting from the master file?
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thanks for watching the video! Excellent question 👍 Unfortunately, Pandas does not keep track of the excel format when loading/importing data. As a workaround, you could apply some styling when exporting the dataframe, e.g. by using ''xlsxwriter" xlsxwriter.readthedocs.io/example_pandas_column_formats.html Taking the example for this tutorial, I applied some styling to the header: pastebin.com/ZNnUgpJr Hope it helps. __________________________ import pandas as pd #pip install pandas import os df = pd.read_excel('Financial_Sample.xlsx') column_name = 'Country' replace_symbols = ['>', '
@EmekKov
@EmekKov 2 жыл бұрын
@@CodingIsFun Hi, can you tell how to keep dates format when doing the split? I suppose it would also lose the original formatting and in some cases it might be important to also keep dates in the same manner as in the master file (and as a date, not string). THX
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@EmekKov In that case, you want to use xlwings and apply an Excel autofilter. Once you have filtered the data, you can export it to a new workbook. I spend this evening coding out an example for you. Please find the GitHub repository here: github.com/Sven-Bo/apply-autofilter-to-excel Keep in mind, xlwings in version 0.25.3 DOES NOT NATIVELY SUPPORT autofilter. Therefore, I am using the underlying API as a workaround. Apart from looking ugly, keep in mind that it makes your code platform-specific (!). This code works only on Windows. Learn more about here: docs.xlwings.org/en/stable/missing_features.html I hope this helps!
@EmekKov
@EmekKov 2 жыл бұрын
@@CodingIsFun Thank you so much. I will defenetly try this out. That's so nice you've spent time on this case
@sudipmitra968
@sudipmitra968 2 жыл бұрын
how do I have the values in multiple sheets in one excel instead of multiple excel files
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Do you mean something like this? kzfaq.info/get/bejne/mtp-obBquZjDonU.html
@trendingnow3973
@trendingnow3973 2 жыл бұрын
By using your code, "can only .str accessor with string valuse!" This error will shown?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Before using str.contains, please convert your column to string values: df['ColumnName'] = df['ColumnName'].astype(str) I hope this helps!
@trendingnow3973
@trendingnow3973 2 жыл бұрын
@@CodingIsFun thanks its worked
@cEruStorm
@cEruStorm 2 жыл бұрын
How do I do this for a csv file?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching the video. Instead of pd.read_excel(), you could use pd.read_csv(). I hope this helps!
@SKini18
@SKini18 2 жыл бұрын
how to filter the data by each item from column "Segment" and also by column "Country" and write those filters in separate sheets ? For Example : Government and France, Government and Germany, Government and Italy....and goes on. Then it should repeat the same using another item from column name "Segment" like "Midmarket" and "Frace", "Midmarket and Germany, "Midmarket and Italy, etc... . How to do this using loop ?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Here you are: from pathlib import Path import pandas as pd # pip install pandas EXCEL_FILE_PATH = Path.cwd() / "Financial_Sample.xlsx" df = pd.read_excel(EXCEL_FILE_PATH) segments = df["Segment"].unique() replace_symbols = [">", "
@SKini18
@SKini18 2 жыл бұрын
@@CodingIsFun thanks for your help.
@mcso204
@mcso204 Жыл бұрын
Hello, Thank you so much for this info, only issue I am having is it is converting my dates from 1/31/2023 to 1/31/2023 12:00:00 AM... Any advice?
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching the video. Can you perhaps elaborate on what it is you're trying to achieve? Please provide some additional info. Thanks!
@mcso204
@mcso204 Жыл бұрын
@@CodingIsFun Hey figured this part out, But I am struggling with how to subtotal on column and append the total to the bottom of the output files, do you have a video that covers this?
@CodingIsFun
@CodingIsFun Жыл бұрын
@@mcso204 I do not have a video about it, but here is the code to do it. I hope it helps! :) pastebin.com/JYv3inX7 from pathlib import Path import pandas as pd # Define & create output directory output_dir = Path(__file__).parent / "output" output_dir.mkdir(parents=True, exist_ok=True) # Define Excel file path excel_file = Path(__file__).parent / "Financial_Sample.xlsx" df = pd.read_excel(excel_file) column_name = "Segment" unique_values = df[column_name].unique() # Iterate over unique values and export to Excel for unique_value in unique_values: # Subset dataframe by unique value df_output = df[df[column_name].str.fullmatch(unique_value)].copy() # Select only numeric columns numeric_cols = df_output.select_dtypes(include=[float, int]).columns # Compute the row and column totals row_totals = df_output[numeric_cols].sum(axis=1) col_totals = df_output[numeric_cols].sum() # Add the row and column totals to the dataframe df_output.loc['Total'] = col_totals total_row = pd.Series(sum(row_totals), index=[f'Total ({unique_value})']) df_output.loc[:, 'Total'] = pd.concat([row_totals, total_row]) # Export to Excel with total row output_path = output_dir / f"{unique_value}.xlsx" df_output.to_excel(output_path, sheet_name=unique_value[:31], index=False)
@mcso204
@mcso204 Жыл бұрын
@@CodingIsFun That works, only one thing. I'd like it to calculate only one of the columns not all numeric columns. if possible.
@CodingIsFun
@CodingIsFun Жыл бұрын
@@mcso204 Feel free to adjust the code. It was just an example;)
@muhammadsaadsiddiqui2539
@muhammadsaadsiddiqui2539 8 ай бұрын
contains function not working in python 3.11 it says in error (AttributeError: 'function' object has no attribute 'contains')
@CodingIsFun
@CodingIsFun 8 ай бұрын
Thanks for watching. Please clone the GitHub repo and try again. Thanks!
@rosalyna_24
@rosalyna_24 2 жыл бұрын
i'm having this error and i don't seem to find a way to fix it No engine for filetype: ''
@CodingIsFun
@CodingIsFun 2 жыл бұрын
With this bit of information, I cannot really help.
@rosalyna_24
@rosalyna_24 2 жыл бұрын
@@CodingIsFun and this is the code import pandas as pd #pip install pandas from pathlib import Path import os from pyecharts import options as opts #pip install pyecharts from pyecharts.charts import Bar, Tab data_frame = pd.read_csv(r'C:\\Users\\Nesrine\\Desktop\\Game\\SteamCharts.csv',encoding= 'unicode_escape') data_frame.to_excel(r'C:\\Users\\Nesrine\\Desktop\\Game\\StreamCharts.xlsx', index=None, header=True) #separate the data into multiple workbooks based on the game name df = pd.read_excel(r'C:\\Users\\Nesrine\\Desktop\\Game\\StreamCharts.xlsx') #reading the dataframe column_name = 'gamename' replace_symbols = ['', ':', '"', '/', '\\\\', '|', '\?', '\!', '\*', '\]', '\+', '\(', '\)', '®', '_', '-','.'] df[column_name] = df[column_name].replace(replace_symbols, '', regex=True).str.strip().str.title() unique_values = df[column_name].unique() #list of unique values of that column for unique_value in unique_values: df_output = df[df[column_name].str.contains(unique_value)] output_path = os.path.join('output', unique_value +'.xlsx') with pd.ExcelWriter(output_path) as writer: # xlsx file type which is file extension for Microsoft Excel Open XML Spreadsheet df_output.to_excel(writer) #df_output.to_excel(output_path, sheet_name=unique_value, index=False)
@mattkhoury4241
@mattkhoury4241 6 ай бұрын
what happens if you have multiple sheets?
@CodingIsFun
@CodingIsFun 6 ай бұрын
Then you need to adjust the code like so: pastebin.com/aFh4FeKp
@sudharshan574
@sudharshan574 3 жыл бұрын
i am getting error : InvalidWorksheetName: Excel worksheet name 'i cant show complete file name in public' must be
@CodingIsFun
@CodingIsFun 3 жыл бұрын
Thanks for watching the video. I have adjusted the code to limit the sheet name length to a maximum of 31 characters: import pandas as pd # pip install pandas import os df = pd.read_excel('NAME_OF_WORKBOOK.xlsx') column_name = 'NAME_OF COLUMN' unique_values = df[column_name].unique() for unique_value in unique_values: df_output = df[df[column_name].str.contains(unique_value)] output_path = os.path.join(os.getcwd(), str(unique_value) + '.xlsx') df_output.to_excel(output_path, sheet_name=unique_value[:31], index=False)
@sudharshan574
@sudharshan574 3 жыл бұрын
@@CodingIsFun Thanks, that helped me a lot. i request you please make a video adding pivot table for each file that we split, for example if we split based on clients name and we want to show pivot for that particular client in sheet2. and split based on client name.
@CodingIsFun
@CodingIsFun 3 жыл бұрын
@@sudharshan574 Thanks for your video suggestion. Sounds like an exciting video topic. I cannot make any promises, but l will see what I can do.
@ahmadmojeed5507
@ahmadmojeed5507 2 жыл бұрын
@@CodingIsFun i would like to increase mine because the column is more than 31 character which i can delete any, please how do i go about it?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
@@ahmadmojeed5507 31 characters is the maximum length set by Excel. There is no workaround
@mrityunjayanrg787
@mrityunjayanrg787 Жыл бұрын
hey i am getting this kind of error "ValueError: Cannot mask with non-boolean array containing NA / NaN values"
@CodingIsFun
@CodingIsFun Жыл бұрын
Thanks for watching. Are you using this code? github.com/Sven-Bo/split-workbooks-python/blob/master/3rd_Example/Split_Workbook.py And which line is causing the error?
@mrityunjayanrg787
@mrityunjayanrg787 Жыл бұрын
@@CodingIsFun Traceback (most recent call last): File "C:\Users\Mrityunjayan R G\PycharmProjects\excelss\main.py", line 21, in df_output = df[df[column_name].str.fullmatch(unique_value)] File "C:\Users\Mrityunjayan R G\PycharmProjects\excelss\venv\lib\site-packages\pandas\core\frame.py", line 3495, in __getitem__ if com.is_bool_indexer(key): File "C:\Users\Mrityunjayan R G\PycharmProjects\excelss\venv\lib\site-packages\pandas\core\common.py", line 144, in is_bool_indexer raise ValueError(na_msg) ValueError: Cannot mask with non-boolean array containing NA / NaN values
@mrityunjayanrg787
@mrityunjayanrg787 Жыл бұрын
@@CodingIsFun this is error i am facing can u able to help me please
@CodingIsFun
@CodingIsFun Жыл бұрын
@@mrityunjayanrg787 Can you try to change the line: df_output = df[df[column_name].str.fullmatch(unique_value)] to this: df_output = df[df[column_name].str.fullmatch(unique_value, na=False)]
@mrityunjayanrg787
@mrityunjayanrg787 Жыл бұрын
@@CodingIsFun nope not happening and it throws so much error
@rajkumarrajan8059
@rajkumarrajan8059 Жыл бұрын
How to load an excel table into the dataframe?
@CodingIsFun
@CodingIsFun Жыл бұрын
import pandas as pd import xlwings as xw # Open the workbook and sheet wb = xw.Book('path_to_workbook.xlsx') sheet = wb.sheets['Sheet1'] # Read the table data into a pandas dataframe df = sheet.range('TableName').options(pd.DataFrame, index=False, header=True).value # Close the workbook wb.close()
@rajkumarrajan8059
@rajkumarrajan8059 Жыл бұрын
@@CodingIsFun It is okay. But this code doesn't include the Table Headers. It is considering the first row as my header. The counting starts from 2 second row
@rajkumarrajan8059
@rajkumarrajan8059 Жыл бұрын
@@CodingIsFun Thank you for your response. That is AWESOME
@PTSLKuansing
@PTSLKuansing 2 жыл бұрын
excuse me,but how to remove bold text and borders ?
@CodingIsFun
@CodingIsFun 2 жыл бұрын
Thanks for watching. Do you mean how to remove, in general, bold text and borders for a specific Excel cell range using Python?
How Many Balloons Does It Take To Fly?
00:18
MrBeast
Рет қаралды 180 МЛН
Despicable Me Fart Blaster
00:51
_vector_
Рет қаралды 26 МЛН
DAD LEFT HIS OLD SOCKS ON THE COUCH…😱😂
00:24
JULI_PROETO
Рет қаралды 15 МЛН
EASILY Combine Multiple Excel Sheets Into One With This Trick
8:48
Kenji Explains
Рет қаралды 266 М.
Read Giant Datasets Fast - 3 Tips For Better Data Science Skills
15:17
Python Simplified
Рет қаралды 49 М.
Create Stunning Python GUIs in 10 Minutes With Drag & Drop
11:38
Coding Is Fun
Рет қаралды 28 М.
Day in the Life of a Data Analyst - SurveyMonkey Data Transformation
1:17:14
Shashank Kalanithi
Рет қаралды 3,4 МЛН
Automate Excel Work with Python and Pandas
21:29
John Watson Rooney
Рет қаралды 93 М.
VBA: Split data into Separate Workbooks
11:25
PK: An Excel Expert
Рет қаралды 103 М.
How Many Balloons Does It Take To Fly?
00:18
MrBeast
Рет қаралды 180 МЛН