Scraping malaysianpaygap & Extracting data from the Instagram posts

Recently, @malaysianpaygap has become well known as a platform that lets workers throughout Malaysia anonymously share their salaries with other Malaysians. It's a great initiative, and I fully support ensuring that Malaysians are not taken advantage of by companies and get a liveable wage (especially when inflation is sky high). NOTE: If you just want the data, you can download the zipped folder from here. How to run: Run the following to get the conda environment set up: conda […]


A Scrapy project for crawling IPTV playlists

A Scrapy project for crawling IPTV playlists. Dependency: Python 3, pip install scrapy. Usage. Output: The output playlist file is playlist.m3u. Note that this file is overwritten every time you run the spider. Customize: You can customize the filter conditions by editing start_urls in ejatv.py. Example: the URL https://eja.tv/?limit=0&country=js&language=Chinese&category=&level=0&search= selects channels from Japan, in Chinese, in any category. Available parameter values are as follows: Category
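As a rough illustration of the filter URL described above, the sketch below rebuilds that example URL from its query parameters with Python's standard library. The helper name build_start_url and its defaults are hypothetical and not part of the project; only the base URL and the parameter names come from the example, and the resulting string is what you would paste into start_urls in ejatv.py.

```python
from urllib.parse import urlencode

# Base URL taken from the example in the excerpt.
BASE_URL = "https://eja.tv/"


def build_start_url(country="", language="", category="",
                    level="0", search="", limit="0"):
    """Hypothetical helper: assemble a filter URL from the query
    parameters that appear in the example URL, in the same order."""
    params = {
        "limit": limit,
        "country": country,
        "language": language,
        "category": category,  # empty string means "any category"
        "level": level,
        "search": search,
    }
    return BASE_URL + "?" + urlencode(params)


# Reproduce the example from the excerpt (values copied as given there).
url = build_start_url(country="js", language="Chinese")
print(url)
```

Leaving a parameter empty keeps it in the query string with no value, which matches how the example URL expresses "any category".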


Download NCERT books using scrapy

Download NCERT books using Scrapy.
How to use
Initial setup:
git clone https://github.com/nit-in/download_ncert_books.git
cd download_ncert_books
pip install -r requirements.txt
To run the spider:
scrapy crawl --nolog ncert
and follow the prompts. For example, if you want to download the Class 11th Economics book:
scrapy crawl --nolog ncert
Enter the class: 11
Select one the subjects:
Enter 1 for Sanskrit
Enter 2 for Accountancy
Enter 3 for Chemistry
Enter 4 for Mathematics
Enter 5 for Economics
Enter 6 for Psychology
Enter […]


A repository with scraping code and soccer dataset from understat.com

UNDERSTAT – SHOTS DATASET: As many people interested in soccer analytics know, Understat is an amazing source of information. They provide Expected Goals (xG) stats for every shot taken in the top 5 leagues in Europe, as well as the Russian league. After watching an awesome tutorial by McKay Johns (great channel btw, loads of resources for beginners in soccer analytics), I decided to write some code to scrape all the shots data available at Understat. As a consequence I […]
