
Beginners Guide To Web Scraping with Python - All You Need To Know

  263,425 views

Tinkernut


1 day ago

The web is full of data. Lots and lots of data. Data prime for scraping. But manually going to a website and copying and pasting the data into a spreadsheet or database is tedious and time-consuming. Enter web scraping! This guide will show you how to get started scraping web data to your heart's content in 8 minutes!
_____________________________
📲🔗🔗📲 IMPORTANT LINKS 📲🔗🔗📲
_____________________________
• 💻PROJECT PAGE💻 - github.com/gig...
• Python 3 - www.python.org...
• BeautifulSoup - www.crummy.com...
• Scraper Testing Website - quotes.toscrape...
• Thonny - thonny.org/
_____________________________
📢📢📢📢 Follow 📢📢📢📢
____________________________
redd.it/5o3tp8
/ tinkernut_ftw
/ tinkernut
/ tinkernut
00:00 Introduction
00:42 Setup
01:16 Background
02:23 Legality Concerns
02:51 Writing The Code
06:47 Conclusion

Comments: 188
@michaelmagill5466 2 years ago
This editing is fantastic, the explanations are clear and concise and completely without obfuscation. You, sir, are a gentleman.
@chanson8508 5 months ago
Big faxxx! so many nonsense intro to scraping vids, but not this one : ))
@Greshma123 4 months ago
I’m sorry 😢 I’m not going
@SonicFusedWith_Goku 3 months ago
Bro this is crazy
@SonicFusedWith_Goku 3 months ago
I was trying to make a code to get stuff from my math homework website
@Sivarajansam931 2 years ago
When world needed him the most, He returned.
@benjaminofurhie8178 2 months ago
I have searched for scraping tutorials for the last month, but this is the BEST. Thanks so much
@japhethmutuku8508 19 days ago
I can teach you web scraping from the basics to advanced. If that may help, you can reach out to me.
@JoaquinRoibal 1 year ago
Great introduction. Clear, concise and covered related topics without being distracting. I look forward to your other videos on Python.
@JccChanco 2 months ago
So far in my life, this has been the smoothest learning process I have ever experienced. Thank you kind sir!
@kedrovasuma2857 2 years ago
This smart man is still alive
@ten132 2 years ago
I was about to comment the same lmao.
@HayCorvus 4 months ago
I grew up in the early YouTube days. I was enamored by the computer knowledge that I could only get from channels like Tinkernut. There really were no schools that offered nuanced coding/web lessons when I was growing up. It wasn't until I went to college and got my degree in Computer Science that I was able to build a foundation in computational theory and all sorts of other fun subjects related to computers. Thanks for helping me along the way on that journey, Tinker!
@lemonbread378 1 year ago
Currently planning for my computer science A level project and wanted to learn what this web scraping thingamajiggy was all about. This video was an amazing introduction! Simple, clear, but not over professional. It didn't leave me feeling overwhelmed, and I'm going to watch more of your tuts now, cheers mate!
@sauceboss38 2 years ago
This is exactly what I was looking for. Very concise and helpful, thank you!
@Flying_turnip187 8 days ago
Very cool project ! I am a beginner in Python and this was right up my alley. I think Data science is going to be my forte. Thanks so much for this !!
@adonisg.j7430 3 days ago
We should connect
@mrklean0292 4 months ago
Man... I've seen other web scraping tutorials and they take you ten miles down the road and throw all types of advanced garbage at you. Granted, I know what you have shown here is the quick and easy way, but that's all I wanted: an understanding of what it is and how it basically works. Thank you.
@Syndesi 2 years ago
cool tutorial :D for more complicated data I use xpath, although its syntax is a bit weird at first. furthermore: validate, validate and validate your data. you do not want a program which crashes randomly, only because a value is missing, empty or malformed :)
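The validation advice above can be sketched in plain Python. This is only an illustration and not code from the video: the function name `clean_rows` and the sample rows are hypothetical, and the checks shown (field count, type, emptiness) are just one reasonable starting set.

```python
def clean_rows(rows):
    """Keep only (quote, author) pairs where both values are non-empty text."""
    cleaned = []
    for row in rows:
        # Skip rows that do not have exactly two fields.
        if len(row) != 2:
            continue
        quote, author = row
        # Skip missing (None) or non-text values instead of crashing later.
        if not isinstance(quote, str) or not isinstance(author, str):
            continue
        quote, author = quote.strip(), author.strip()
        # Skip values that are empty once surrounding whitespace is removed.
        if not quote or not author:
            continue
        cleaned.append((quote, author))
    return cleaned


# Hypothetical scraped rows: one good, one empty, one missing, one malformed.
sample = [
    ("  An example quote.  ", "Example Author"),
    ("", "Nobody"),
    (None, "Missing"),
    ("Only one field",),
]
print(clean_rows(sample))
```

Running the cleaning step before writing the CSV means a single bad element on the page costs one skipped row rather than a crash halfway through the file.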
@algj 2 years ago
This is crazy to see your videos again being recommended :o it has been years since I saw your last video!
@renaaaa05 26 days ago
I was given a task in my internship that involved web scraping and this was very helpful, thank you!
@santiagoSosaH 2 years ago
wooooow it's been years that i didn't see a video about tinkernut. i think about 10 years ago i learned sql and php with your tutorial about making a webpage with users passwords etc. man so nice to see a video of you.
@bng3832 2 years ago
I swear to god you are the best! I now see why YouTube doesn't recommend great videos. It's because YouTube doesn't want people to study tech!!
@Raxer_th 2 years ago
This channel used to have like 100k views. Now its down to just less than 10k. Idk why. When I was around 13, I wanted to make an fps game and found his video to be very interesting. I follow this channel since then. Tinkernut was the reason I started learning programming. After watching his HTML tutorial (create a website from scratch). Even though I neither have com-sci degree nor working as a programmer, I'm still learning python during my freetime. Thank you Daniel.
@toniphillips9269 2 years ago
Yeah poops yeah lol iaooapaoopp lol oowss d’s aIA
@proxyscrape 1 year ago
I love that you used a Raspberry Pi in this tutorial. It's amazing to mess around on and do little experiments.
@Squid666 4 months ago
I always end up back here when I need a refresher on scraping ❤ thank you!
@InspiredInsights4U 2 years ago
A savvy businessman could use web scraping to scrape a competitor's website for product pricing, including product numbers, photos and prices, and then use this to monitor their price changes and/or adjust the prices on their own website to stay slightly more competitive.
@AirmanKolberg 2 years ago
Web scraping is to copying and pasting manually, as copying and pasting manually is to using your eyeballs, memorising, then typing it into a file. There is no difference between surfing the web and web scraping. One is just faster. Like how copy/pasting something from Wikipedia is faster than reading and re-writing it.
@jalanmcrae 1 year ago
Yes, automation is a huge time saver 👍🏾
@TheJoyOfGaming 2 years ago
haha awesome man. I don't even do coding but couldn't resist following along just to try it! Cheers!
@NasimKhan-tk3ij 1 year ago
Overall, I highly recommend this video to anyone who is interested in learning Python. It is a comprehensive and informative resource that will teach you what you need to know to get started with this powerful programming language.
@desecrated.eviscerated 9 months ago
if you get an error, try replacing the line of code with: file = open('scrapped_quotes.csv', 'w', encoding='utf-8', newline='')
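The suggested line changes two things, and a short sketch may make them clearer. `encoding='utf-8'` lets characters such as the curly quotation mark `\u201c` be written on systems whose default encoding cannot represent them, and `newline=''` leaves line endings to the `csv` module, which is what prevents an empty row between records on Windows. The temporary file path, the header names and the sample row below are assumptions made for the illustration.

```python
import csv
import os
import tempfile


def write_quotes(path, rows):
    # encoding="utf-8" handles curly quotes and other non-ASCII characters.
    # newline="" stops extra blank rows from appearing between records.
    with open(path, "w", encoding="utf-8", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(["QUOTES", "AUTHORS"])  # assumed header names
        writer.writerows(rows)


# Written to the system temp folder so the example does not clutter the project.
path = os.path.join(tempfile.gettempdir(), "scraped_quotes_example.csv")
write_quotes(path, [("\u201cAn example quote.\u201d", "Example Author")])

# Read the file back to confirm it round-trips.
with open(path, encoding="utf-8", newline="") as file:
    print(list(csv.reader(file)))
```

Using `with` also closes the file automatically, so nothing is lost if the script stops before reaching an explicit `file.close()`.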
@donsurlylyte 2 years ago
dude, that intro proves you have a bright future in infomercials!
@SarahGamigbigboss 1 year ago
Funny how it's titled Beginners Guide to Scraping and once he's done with the introduction and starts typing a bunch of codes that " beginners" have absolutely no clue how to do... Thanks, man great help!
@myriadtechrepair1191 2 years ago
Our lord has returned.
@webslinger2011 2 years ago
Your technological code geniusness shall be added to my own. Seriously looking for this. Thanks!
@intellectualhybrid2 2 years ago
Love the Borg reference XDD
@benjaminblack8653 2 years ago
So glad to see you posting again! I missed your videos so much. I believe my first video of yours was either How to Setup a Webserver or How to Make an Operating System. Both excellent videos!
@Geeksmithing 2 years ago
Hey man, this is great!! Happy to see another video from ya!
@slattbizz22 2 days ago
Honestly this is just what I needed 😭
@liamhughes7093 1 year ago
Great video. With the phrase "web scraper", I can't help but picture a function that returns a digital box chevy with candy paint, 26" chrome rims, tinted windows, and triple 15" subs in the trunk with some Too $hort going. I hope someone else from Northern California is thinking the same thing, and cracks up seeing this. But thank you for your fantastic educational video! cheers.
@teomanefe 2 years ago
I actually needed this!
@wrzq 7 months ago
Beautiful tutorial, exactly what I've been looking for. Thanks a lot, Man!
@fearlessAx 1 year ago
Hey, I'm getting "NameError: name 'page_to_scrape' is not defined"
@lucasn0tch 2 years ago
Long time no see. This may be useful for tracking stock for a PS5/Xbox/Switch/GPU in these times.
@JoaoPedro-ki7ct 2 years ago
Even a Switch is being scalped? I heard about PS5, Xbox Series X|S, GPUs but not about the Switch itself.
@DTMPro 2 years ago
Where can we find out if we are allowed to scrape data from a specific website so that we don't eventually end up in trouble? Does the scraping code/process work the same way for scraping product prices, e.g. trying to replicate camel for amazon, or does that take additional authorization from amazon?
@Tinkernut 2 years ago
Excellent question! All popular websites have a scraping/crawling text file called "robots.txt". This tells what can and can't be scraped from a website. Here is an example of Amazon's robots.txt file (spoiler, you can't scrape much) www.amazon.com/robots.txt
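As a rough sketch of how a script could consult those rules itself, Python's standard library includes `urllib.robotparser`. The rules below are hypothetical and written inline so the example runs offline; the helper name `is_allowed` and the user agent string `my-scraper` are also made up for the illustration. For a live site, `set_url()` followed by `read()` would download the real file instead of calling `parse()`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, not copied from any real website.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_text, user_agent, url):
    parser = RobotFileParser()
    # parse() accepts the lines of a robots.txt file that is already in memory.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(EXAMPLE_RULES, "my-scraper", "https://example.com/private/page"))
print(is_allowed(EXAMPLE_RULES, "my-scraper", "https://example.com/quotes"))
```

A check like this only reports what the file asks crawlers to do; a site's terms of service may add restrictions that robots.txt does not mention.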
@jimavictor6022 2 years ago
@Tinkernut what about those non popular websites with no robots.txt file
@JoaoPedro-ki7ct 2 years ago
@jimavictor6022 As long as you don't scrape things like other people's documents from governmental sites or usernames plus passwords, you should be fine with the rest. What website owners are really worried about is their website availability (whether they are online or offline) and bandwidth usage, as they pay X for X amount of gigabytes consumed (they pay for each gigabyte they send to and receive from users). So as long as you don't consciously or unconsciously take down their site, you're fine.
@JoaoPedro-ki7ct 2 years ago
@jimavictor6022 On top of that, they have their automated ways to detect bots. The worst that can happen is getting your IP "banned" or simply restricted from viewing their webpages, and that will happen way, way, way before you get sued by them.
@jimavictor6022 2 years ago
@JoaoPedro-ki7ct I really appreciate the reply. Thank you..
@arjunaudupi7956 2 years ago
@tinkernut you are the reason for me being a software developer.. Thanks dude. Keep up the good work..
@kenjohnsiosan9707 1 year ago
it's a coincidence that I have a task to scrape data and format it to CSV then send it to email. thank you for this tutorial, sir.
@craftedpixel 2 years ago
The legend is back!
@Web.Scraping 14 days ago
Fantastic video. Short and useful 👍
@KowboyUSA 2 years ago
Just the inexpensive project I needed.
@lundebc 2 years ago
Thanks for this tutorial, Looking forward to the next part.
@YeshuaIsTheTruth 1 year ago
These are the kinds of programming videos we need!
@htstube1 1 year ago
great video! seems very straight forward and easy to follow. I will be trying it out in the next day or two
@mmuneebahmed 2 years ago
Thanks for sharing the expertise! However, I get the following error when running the code: writer.writerow([quote.text, author.text]) UnicodeEncodeError: 'latin-1' codec can't encode character '\u201c' in position 0: ordinal not in range(256)
@sagarnewpane8549 2 years ago
I need more content on the Raspberry Pi Pico!!
@dugumayeshitla3909 1 year ago
One of my favorite channels for learning ... you rock
@martinmcbrown6437 13 days ago
Ok, so this is amazing, thank you! How would you generalize a scraper like I want to scrape all the news sites in the world and extract the main articles?
@gamerguy9533 4 months ago
Thanks! Super basic but it was what I needed to make my code start working!
@DroidEagle 2 years ago
dude where were u?
@colinbrown6629 2 months ago
Amazing video to get you started with scraping, thanks!
@pulp6667 2 years ago
Thank you for this video I created another scraper for eth, it's rough but it's my first and I am so happy
@Code_Play_com 5 months ago
Very practical and helpful video with very detailed explanation!
@silversurfer3837 22 days ago
Helpful indeed, thanks!
@deepvoyager01 6 months ago
Thank you for the video, it helped me to understand how a scraper works
@redentorg.bucalingjr.6320 1 month ago
Very nice presentation...
@mrmxyzptlk8175 1 year ago
Error: "No module named bs4"
@recursion. 11 months ago
Facing the same, were you able to fix it?
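"No module named bs4" generally means the BeautifulSoup package is not installed for the Python interpreter that is running the script. The package is published on pip as `beautifulsoup4`, while the import name is `bs4`. The sketch below checks which packages are missing; the helper name `missing_packages` is hypothetical, and listing `requests` is an assumption about what the tutorial script uses to download the page.

```python
import importlib.util

# Import name -> name to give to pip.
# "requests" is an assumption about the script's download library.
REQUIRED = {"bs4": "beautifulsoup4", "requests": "requests"}


def missing_packages(required):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages(REQUIRED)
if missing:
    print("Install with: python -m pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

In Thonny, the same packages can usually be installed from the Tools menu under Manage packages, which targets the interpreter Thonny itself runs.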
@goodbook6865 1 year ago
Awesome video! Short and to the point. Thank you!
@KontrolStyle 2 years ago
well explained, ty
@reghawkins73 2 years ago
I had to add encoding to the line--- file = open("scraped_quotes.csv", "w", encoding='utf-8')
@RodWorldTours-fo6mh 8 months ago
Most well earned subscriber ever
@OtherDalfite 2 years ago
Halloween intro? At the end of November? This videos been a while in the making huh?😂
@codingmaster24 2 years ago
Best YouTuber.
@ArqitectTV 1 year ago
What if the data you are searching for is obtainable but is on separate pages within a given site.
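One common way to handle data spread over several pages is to scrape a page, look for its "next" link, and repeat until there is none. The sketch below assumes the `beautifulsoup4` package is installed and that the site marks its link as `<li class="next"><a href="...">`; that markup, the function name `scrape_all_pages` and the two in-memory pages are all assumptions for illustration. The `fetch` argument is any function that returns a page's HTML for a URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def scrape_all_pages(start_url, fetch, max_pages=50):
    """Follow "next" links and collect the quote text from every page."""
    quotes = []
    url = start_url
    pages_seen = 0
    # max_pages is a safety limit in case the links form a loop.
    while url and pages_seen < max_pages:
        soup = BeautifulSoup(fetch(url), "html.parser")
        quotes.extend(
            span.text for span in soup.find_all("span", attrs={"class": "text"})
        )
        pages_seen += 1
        # Assumed markup: <li class="next"><a href="/page/2/">Next</a></li>
        next_item = soup.find("li", attrs={"class": "next"})
        link = next_item.find("a") if next_item else None
        # urljoin turns a relative link such as /page/2/ into a full URL.
        url = urljoin(url, link["href"]) if link else None
    return quotes


# Two hypothetical pages held in memory so the sketch runs offline.
FAKE_SITE = {
    "https://example.com/": (
        '<span class="text">First</span>'
        '<li class="next"><a href="/page/2/">Next</a></li>'
    ),
    "https://example.com/page/2/": '<span class="text">Second</span>',
}

print(scrape_all_pages("https://example.com/", FAKE_SITE.__getitem__))
```

Against a real site, `fetch` could be something like `lambda url: requests.get(url).text`, assuming the `requests` package is installed.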
@thecryptocheckpoint5083 2 years ago
Wow really great production . Lots of history and info
@mudasir2168 1 year ago
Awesome stuff.....much appreciated!
@jackschwabe4929 10 months ago
Great video. Very easy to implement and understand
@Corkyjett 2 years ago
this tutorial was great!! thank you!
@nikro7239 6 months ago
when I write to csv file for some reason there is always one free row (with literally nothing) between the actual rows with data
@JayD-jn9or 4 months ago
Thanks for the vid! After a VERY VERY long time I'm getting back into casual coding and looking to casually make some info-scraping programs for games, with the option to select which info the person wants to see. So if the site allows scraping, would it be better to have my app in progress be independent and have checks done once a minute or every five minutes? Or have the info scraped, processed and posted on a site I create, and retrieved from there by people using the app? That is, if I start sharing the app. My concern is annoying the site owners by checking too often. Forgive me if it's a silly question, I'm not experienced with scraping.
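Whichever design is chosen, spacing requests out is a simple way to avoid hammering a site. The sketch below pauses between requests; the function name `fetch_politely`, the five-second default and the stand-in `fake_fetch` are hypothetical choices, not recommendations from the video.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


# Stand-in for a real download function so the sketch runs offline.
def fake_fetch(url):
    return "<html>" + url + "</html>"


pages = fetch_politely(
    ["https://example.com/a", "https://example.com/b"],
    fake_fetch,
    delay_seconds=0.1,
)
print(len(pages))
```

On the design question itself, scraping once on a server and letting every copy of the app read that cached result means the source site sees one request per interval regardless of how many people use the app, whereas independent apps multiply the load by the number of users.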
@RigzoTV 2 years ago
Need more advanced lessons on scraping.
@harrystone7954 2 years ago
very logical and understandable explanation
@CareerHubSpot 1 year ago
Concise and precise
@dillkhalifa 7 months ago
you owe me bro. i just subscribed to your channel😂😂
@user-vz7ff8ps8k 9 months ago
Thanks a lot for this clear video! How would I retrieve more information associated with the quote? For instance I would like to receive and print both the author and the associated tags.
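One way to keep the author and tags attached to the right quote is to loop over each quote's container element and search inside it, rather than collecting quotes and authors in separate lists. The sketch below assumes the `beautifulsoup4` package is installed; the sample HTML is a simplified guess at the test site's structure (a `div` with class `quote` holding the text, the author and `a` elements with class `tag`), so the class names should be checked against the real page.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup modelled on a quotes page.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">An example quote.</span>
  <small class="author">Example Author</small>
  <div class="tags"><a class="tag">life</a><a class="tag">humor</a></div>
</div>
<div class="quote">
  <span class="text">Another example quote.</span>
  <small class="author">Another Author</small>
  <div class="tags"><a class="tag">books</a></div>
</div>
"""


def extract_quotes(html):
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Searching inside each container keeps text, author and tags together.
    for block in soup.find_all("div", attrs={"class": "quote"}):
        text = block.find("span", attrs={"class": "text"}).text
        author = block.find("small", attrs={"class": "author"}).text
        tags = [tag.text for tag in block.find_all("a", attrs={"class": "tag"})]
        results.append((text, author, tags))
    return results


for text, author, tags in extract_quotes(SAMPLE_HTML):
    print(text, "-", author, "-", ", ".join(tags))
```

`find_all` is the current spelling of the `findAll` method mentioned elsewhere in these comments; both names do the same job.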
@Warkeds 2 years ago
This channel is awesome!!
@HayaBaqir 8 months ago
What are the pips we need to install?
@InvinsableNoob 2 years ago
The avatar has returned 🙌
@almutabbil-jn2pt 2 months ago
The code didn't create any csv file although I didn't get any error ! why is that?
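A possible explanation, offered as a guess since the commenter's setup is unknown, is that the file was created in a different folder than expected. A filename without a folder is resolved against the current working directory, which is not always the folder containing the script. This short sketch prints where a file named `scraped_quotes.csv` would end up.

```python
from pathlib import Path

# A relative filename is resolved against the current working directory.
output = Path("scraped_quotes.csv")

print("Working directory:", Path.cwd())
print("The CSV would be written to:", output.absolute())
```

It is also worth confirming that the script reaches its `file.close()` line, or opens the file in a `with` block, so that buffered rows are actually flushed to disk.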
@santoshpandey23 6 months ago
Thanks, this was very good. Can you share a link where you have done the same for a website which requires a username and password? Thanks a ton
@IamTheHolypumpkin 2 years ago
I just checked a website I want to scrape in the future, but this will be significantly more difficult. I want to get live train schedules, but the live data is inside a JavaScript pop-up window.
@JoaoPedro-ki7ct 2 years ago
You might need to use dedicated tools for that, maybe things like Selenium or something related could help you with that.
@NitishKumarIndia 1 year ago
I love this man
@Autoscraping 7 months ago
An extraordinary piece of video material that has proven highly useful for our new team members. Your generosity is immensely appreciated!
@RENO_K 5 months ago
I'm only giving a good comment bc my gf told me to. Good video👍
@AllanYacaman 1 month ago
this seems so refreshing? Why did he stop uploading?
@Jean_villegas 2 months ago
Thanks
@hussainmahady5295 2 years ago
Awesome 🔥 bro. Can you make a tutorial about tunnelling and vpns
@Tinkernut 2 years ago
Sure can! I made them both a few years ago ;-) Just search my channel
@jenschristiannrgaard4878 8 months ago
how much more difficult is it if I want all sub-pages where you would normally find more information?
@martinrages 2 years ago
Can websites detect scraping? If so, how do i escape the dutch AIVD
@JoaoPedro-ki7ct 2 years ago
Yes, they have their ways to detect automated requests, but what they do when they detect "bots" is up to each website.
@LiEnby 2 years ago
yes and no, you can check for things like user agent string or try run javascript or something like that, however its actually a really hard problem to solve because a scraping script can look indistinguishable from a browser ..
@royalhermit 2 years ago
What is line 10 "w"? I am getting NameError: name 'scraped_quotes' is not defined
@ashrude1071 2 years ago
You probably have a typo
@Tinkernut 2 years ago
Running it with my code from github works fine github.com/gigafide/basic_python_scraping/blob/main/basic_scrape_csv_export.py
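On the question in this thread: in a call such as `open("scraped_quotes.csv", "w")`, the `"w"` is the file mode and means "open for writing", creating the file or emptying an existing one. A `NameError` that names `scraped_quotes` usually means Python read the filename as a variable, for example because the quotation marks around it are missing. That cause is a guess, since the commenter's code is not shown; the sketch below reproduces it deliberately.

```python
def open_without_quotes():
    try:
        # Without quotation marks, Python looks for a variable named
        # scraped_quotes, fails to find one, and never opens a file.
        open(scraped_quotes.csv, "w")
    except NameError as error:
        return str(error)


print(open_without_quotes())
```

The same error for names like `soup` or `page_to_scrape` tends to have a similar cause: the line that defines the name is missing, misspelled, or placed after the line that uses it.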
@jpsl5281 1 year ago
its not working with opentable
@serhiyranush4420 2 years ago
Great explanation. Simple and to the point. Had to look up, though, what the zip function did, but, I guess, it's even better that I had to find it out on my own. However, the quotation marks are not saved right in the csv file; instead, they show as 3 weird characters. They do display correctly in Thonny, though. Also, the authors are not put into a separate column, but in the same one with the quote. Also, the quote with a semicolon in it got broken at this semicolon in two parts, and the second part was placed into a separate column. Also, in the csv file open I had to put encoding = "utf-8" after the "w", because I was getting an encoding error. Could this somehow be causing the above problems?
@kaiperdaens7670 7 months ago
same problems here (except the third). I am happy that it isn't just me, but I don't know how to fix them bc I am new to this.
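These symptoms are probably not caused by `encoding="utf-8"` itself but by how the spreadsheet program reads the file. Three odd characters in place of a curly quote is what UTF-8 text looks like when it is opened as a legacy encoding, and a quote splitting at a semicolon suggests the program treats the semicolon as the column separator. The sketch below is one possible workaround under those assumptions; the function name, file path and sample row are hypothetical.

```python
import csv
import os
import tempfile


def write_for_excel(path, rows):
    # "utf-8-sig" writes a byte order mark that Excel uses to recognise UTF-8.
    with open(path, "w", encoding="utf-8-sig", newline="") as file:
        # QUOTE_ALL wraps every field in double quotes, so a semicolon or
        # comma inside a quote is not mistaken for a column separator.
        writer = csv.writer(file, quoting=csv.QUOTE_ALL)
        writer.writerows(rows)


path = os.path.join(tempfile.gettempdir(), "excel_friendly_example.csv")
write_for_excel(path, [("\u201cFirst part; second part.\u201d", "Example Author")])

# Show the raw bytes, including the byte order mark at the start.
with open(path, "rb") as file:
    print(file.read())
```

If the author still lands in the same column as the quote, the spreadsheet's regional settings may expect semicolons between columns; passing `delimiter=";"` to `csv.writer`, or importing the file through the program's text import dialog, are two things to try.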
@lolkek6807 6 months ago
what if I want just the first quote?not all
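To get only the first match, BeautifulSoup's `find` can be used in place of `findAll`/`find_all`, or the full result list can be sliced. The sketch assumes the `beautifulsoup4` package is installed; the sample HTML and its class name are made up to resemble the quotes page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two quotes.
SAMPLE_HTML = """
<span class="text">First example quote.</span>
<span class="text">Second example quote.</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching element, or None if nothing matches.
first = soup.find("span", attrs={"class": "text"})
print(first.text)

# Slicing the full result list works when the first few items are wanted.
first_two = [
    span.text for span in soup.find_all("span", attrs={"class": "text"})[:2]
]
print(first_two)
```

Because `find` can return `None`, checking the result before reading `.text` avoids an `AttributeError` on pages where the element is absent.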
@jackrider798 2 years ago
Love your videos, I don’t understand much of the content, but what’s the difference between taking these quotes via code and just copy pasting into a excel sheet? I’m a noob sorry
@JoaoPedro-ki7ct 2 years ago
You can do it automatically every X amount of time. You can use a "bot" to do something with the data you scraped. I don't use Excel, but if you're talking about what I am thinking of, Excel is doing exactly what was talked about in this video: web scraping. The thing is that Excel is doing it for you without the need for you to program it first, but the web scraping it does is very, very limited compared to what tools made for scraping can do.
@Ryan1456100 2 years ago
In practice? Nothing is different, you get the same result. However, let's say you have a website with 2000 quotes and you need to keep a sheet up to date. That's where a scraper would be useful, as its time you really only need to spend once, plus, at that kind of scale it would be faster to write the code than do it manually.
@jackrider798 2 years ago
@JoaoPedro-ki7ct thank you!
@nikitadorosh244 5 months ago
Nice stuff, X.
@RobloxPrompt 4 months ago
Yeah, I thought it was very nice too. For me I use visual studio and I found it to be very helpful since I was able to use python and install the pips for python via command prompt then use visual studio code. Though what my primary application would be for finding different sites from a website. Would be interesting for finding src's and href's. Nice name btw. I like the commonality of it.
@Mcmiddies 2 years ago
Hey Tinkernut. Welcome back to my feed.
@Pixilmb12 9 months ago
I use IDLE, but for some reason in the 'soup.findAll' function it says "NameError: name 'soup' is not defined" :(
@Pixilmb12 9 months ago
Fixed 🤦‍♂
@DrDre001 2 years ago
Nice! I need to learn Python
@havenurmom5375 1 month ago
this is entertaining the first thirty seconds lol