Convert Parquet To CSV in Python with Pandas | Step by Step Tutorial

  11,704 views

DataEng Uncomplicated


Step-by-step tutorial on how to convert a single Parquet file to a CSV file using Python with the pandas library. This video covers how to convert the data both without compression and with compression.
Timeline:
0:00 Introduction
0:20 Read Parquet file Into Python
1:31 Write Without Compression
3:08 Write with gzip Compression
#python
#pandas
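
The two write steps in the timeline can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the function names `dataframe_to_csv` and `parquet_to_csv` and the `use_gzip` flag are hypothetical, and it assumes a Parquet engine such as pyarrow is installed.

```python
import pandas as pd


def dataframe_to_csv(df, csv_path, use_gzip=False):
    """Write a DataFrame to CSV, optionally gzip-compressed."""
    # compression=None writes plain text; "gzip" writes a gzip file.
    compression = "gzip" if use_gzip else None
    # index=False keeps the pandas row index out of the CSV.
    df.to_csv(csv_path, index=False, compression=compression)


def parquet_to_csv(parquet_path, csv_path, use_gzip=False):
    """Read a single Parquet file and write it back out as CSV."""
    # read_parquet needs an engine (pyarrow or fastparquet) to be installed.
    df = pd.read_parquet(parquet_path)
    dataframe_to_csv(df, csv_path, use_gzip=use_gzip)
    return df
```

With placeholder file names, `parquet_to_csv("data.parquet", "data.csv")` would write an uncompressed file and `parquet_to_csv("data.parquet", "data.csv.gz", use_gzip=True)` a gzip-compressed one.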

Comments: 22
@aishamuhammed241 2 years ago
thank you so much, this helped me fix an urgent issue at work. subscribed!
@DataEngUncomplicated 2 years ago
Thanks Aisha! I'm glad it was helpful!
@felipecabral7034 2 years ago
Thanks! Helped a lot
@DataEngUncomplicated 2 years ago
I'm glad it was helpful Felipe!
@multitaskprueba1 2 years ago
Fantastic video! However, how would you convert the other way around, from a FOLDER of CSV files to parquet files?
@DataEngUncomplicated 2 years ago
Thanks! The read CSV method should be able to handle a folder of CSV files!
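
For the folder question, one possible approach with plain pandas is to loop over the files, since `pd.read_csv` is normally given one file at a time. This is only a sketch under that assumption: the function name `csv_folder_to_parquet` and the directory arguments are hypothetical, and writing Parquet assumes pyarrow or fastparquet is installed.

```python
from pathlib import Path

import pandas as pd


def csv_folder_to_parquet(csv_dir, parquet_dir):
    """Convert every *.csv file in csv_dir to a .parquet file in parquet_dir."""
    csv_dir, parquet_dir = Path(csv_dir), Path(parquet_dir)
    parquet_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # sorted() only makes the processing order predictable.
    for csv_path in sorted(csv_dir.glob("*.csv")):
        df = pd.read_csv(csv_path)
        out_path = parquet_dir / (csv_path.stem + ".parquet")
        df.to_parquet(out_path, index=False)
        written.append(out_path)
    return written
```

Each CSV file becomes its own Parquet file with the same base name; the function returns the list of paths it wrote.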
@jorgeirai27 1 year ago
Hi! Sorry, I have a question. I've already read my parquet file with pandas and successfully modified it. But I don't know how to resave my modified parquet as a new parquet file. Do you know how to do it? Thank you
@DataEngUncomplicated 1 year ago
Hi Faze, to resave your data as a new parquet file with AWS Data Wrangler, use the s3.to_parquet method: aws-sdk-pandas.readthedocs.io/en/stable/stubs/awswrangler.s3.to_parquet.html
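
Since the question is about pandas rather than AWS Data Wrangler, a pandas-only alternative would be `DataFrame.to_parquet`. The sketch below is an assumption about what the commenter needs, not the reply's method; the helper name `resave_parquet` and its `transform` argument are hypothetical.

```python
import pandas as pd


def resave_parquet(src_path, dst_path, transform):
    """Read a Parquet file, apply a change, and save the result as a new file."""
    df = pd.read_parquet(src_path)
    # transform is any function that takes a DataFrame and returns the modified one.
    modified = transform(df)
    # to_parquet writes with the installed engine (pyarrow or fastparquet).
    modified.to_parquet(dst_path, index=False)
    return modified
```

For example, `resave_parquet("old.parquet", "new.parquet", lambda df: df.dropna())` would drop rows with missing values and save the result under a new name (both file names are placeholders).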
@oscarandresgarnica6354 1 year ago
Great video. Although, I am getting this error: ImportError: Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'. A suitable version of pyarrow or fastparquet is required for parquet support. Trying to import the above resulted in these errors: - Missing optional dependency 'pyarrow'. pyarrow is required for parquet support. Use pip or conda to install pyarrow. - Missing optional dependency 'fastparquet'. fastparquet is required for parquet support. Use pip or conda to install fastparquet.
@DataEngUncomplicated 1 year ago
Hey, follow these steps to install pandas correctly. I have a separate video that solves this issue: kzfaq.info/get/bejne/jqx_aM2VrNmxlZ8.html
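
The error message itself names the fix: install one of the optional engines, for example with `pip install pyarrow`. A quick way to check which engines the current environment can see might look like this (the helper name `available_parquet_engines` is hypothetical):

```python
import importlib.util


def available_parquet_engines():
    """Return the Parquet engines that can be imported in this environment."""
    candidates = ("pyarrow", "fastparquet")
    # find_spec returns None when a top-level package is not installed.
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


engines = available_parquet_engines()
if engines:
    print("Parquet engines found:", ", ".join(engines))
else:
    print("No Parquet engine found; install pyarrow or fastparquet.")
```

If the list comes back empty, the package was probably installed into a different Python environment than the one running the script.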
@Yush214 1 year ago
Hey, can you make a video on how to convert a CSV file to an ORC file in Python?
@theoopsiedaisiegaisi 2 years ago
Hi, I got this error: Traceback (most recent call last): File "", line 1, in AttributeError: 'module' object has no attribute 'read_parquet'
@DataEngUncomplicated 1 year ago
Hi, did you import AWS Wrangler correctly?
@theoopsiedaisiegaisi 1 year ago
@DataEngUncomplicated May I know if you have any video on how to install it?
@DataEngUncomplicated 1 year ago
Wait, sorry, this video uses pandas; I confused it with another video :). Yes, check out this video to install pandas if you haven't already: kzfaq.info/get/bejne/jqx_aM2VrNmxlZ8.html
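
An `AttributeError` on `read_parquet` often points to a very old pandas install (the function was added in pandas 0.21) or to a local file named `pandas.py` shadowing the library. That diagnosis is a guess from the error text alone; a small check along these lines could help narrow it down (the helper name `supports_read_parquet` is hypothetical):

```python
import pandas as pd


def supports_read_parquet(module=pd):
    """Return True if the given module exposes a read_parquet function."""
    return hasattr(module, "read_parquet")


# The version shows whether pandas is too old; the path shows which file was imported.
print("pandas version:", pd.__version__)
print("imported from:", pd.__file__)
print("read_parquet available:", supports_read_parquet())
```

If the printed path points at your own project folder instead of the installed package, renaming the local file is the likely fix.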
@nivethamurali359 1 year ago
Unfortunately, it's not working for a complex parquet file.
@DataEngUncomplicated 1 year ago
Strange, what do you define as a complex parquet file?
@theoopsiedaisiegaisi 1 year ago
Hi, I got this error: argument of type 'method' is not iterable
@DataEngUncomplicated 1 year ago
Hi, sorry, it's hard to troubleshoot without knowing more details.
@arielgg5039 11 months ago
Can I do this with PyCharm?
@DataEngUncomplicated 11 months ago
Of course! It's just Python code, so you can use any IDE.
@miaembroidery256 1 year ago