Low Level Data Extraction from Wikipedia Data with Python

Views: 4,169

Jeff Heaton

Days ago

In this video, I present a multi-threaded Python utility that downloads Wikipedia's text and runs your code across all of it.
Code for This Video:
github.com/jeffheaton/present...
Course Homepage: sites.wustl.edu/jeffheaton/t8...
Follow Me/Subscribe:
/ heatonresearch
github.com/jeffheaton
/ jeffheaton
Support Me on Patreon: / jeffheaton
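
For readers who want a feel for the approach before opening the repository, here is a minimal sketch of reading a compressed Wikipedia dump without unpacking it first. It is not the code from the video: the function name `iter_pages` is hypothetical, the example is single-threaded, and it assumes the dump is a bz2-compressed MediaWiki XML export with `page`, `title`, and `text` elements. It uses only the standard library (`bz2.open` and `xml.etree.ElementTree.iterparse`).

```python
import bz2
import xml.etree.ElementTree as ET


def iter_pages(dump_path):
    """Yield (title, text) pairs from a bz2-compressed MediaWiki XML dump.

    Hypothetical helper for illustration; not taken from the video's code.
    """
    # bz2.open decompresses on the fly, so the dump never has to be
    # unpacked to disk.
    with bz2.open(dump_path, "rb") as stream:
        title, text = None, None
        # iterparse reports each element as soon as its closing tag is read.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Dump elements carry an XML namespace; keep only the local name.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title":
                title = elem.text
            elif tag == "text":
                text = elem.text or ""
            elif tag == "page":
                yield title, text
                title, text = None, None
                # Drop the finished page's children to limit memory use.
                elem.clear()
```

Looping over `iter_pages("enwiki-latest-pages-articles.xml.bz2")` (the file name here is an assumption; use whatever dump file you downloaded) yields one article at a time, and each article can then be handed to a worker thread or process.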

Comments: 9
@tamarisauce1278 3 years ago
I have to confess, this is one of my favorite channels. I can't wait to build a new PC with an 8-core CPU to be able to run this script.
@PatatjesDora 3 years ago
Thanks for the effort Jeff!
@Nusinow98 3 years ago
A Wikipedia mining with Colab video would be an awesome idea! :D
@marcelmarceli8246 3 years ago
Hi Jeff, I'm trying to follow up on your tutorial, but it's unclear to me from the very beginning. You download the bz2 file, and then you say "I'll just have python open them up compressed and run over them", how do you do that? What do you mean by that? Thanks.
@microgamawave 2 years ago
do another part plss
@cybersphere 2 years ago
Interesting, although I'm not sure why you'd want to download all of Wikipedia when you can just scrape the live site.
@AlexGelinas42069 2 years ago
rate limiting
@Tobaman111 3 years ago
Very nice. Couldn’t you use ThreadPoolExecutor and release the GIL?
@joliver1981 3 years ago
Suggestion, please watch the videos before publishing to KZfaq to catch issues. The audio doesn’t match video and the code is way too small to read. Also thought I heard snoring lol. Love the channel but some of these issues I have witnessed before which makes me think you publish without watching your videos.