Scraper to get movie and show information from IMDb.

imdbscrapper

Scraper that collects title information from IMDb and indexes it into movies and shows, with rating, release date, and a few other fields.

Situation

Finding movies or shows to watch based on rating and release date otherwise means searching and taking notes manually.

Task

Create a way to automatically index IMDb entries so they can be searched and filtered afterwards with common software, such as a spreadsheet.
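As an illustration of the kind of filtering this is meant to enable, here is a minimal sketch that does in Python what a spreadsheet filter would do. The column names (`title`, `rating`, `release_date`) and the sample rows are assumptions made for this example; the real CSV files may use different headers.

```python
import csv
import io


def filter_titles(csv_file, min_rating, min_year):
    """Return rows whose rating and release year meet both thresholds."""
    matches = []
    for row in csv.DictReader(csv_file):
        try:
            # Assumed columns: "rating" as a number, "release_date" as YYYY-MM-DD.
            rating = float(row["rating"])
            year = int(row["release_date"][:4])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or malformed values
        if rating >= min_rating and year >= min_year:
            matches.append(row)
    return matches


# Hypothetical sample standing in for movies.csv.
sample = io.StringIO(
    "title,rating,release_date\n"
    "Example A,8.1,2019-05-01\n"
    "Example B,6.4,2020-01-15\n"
    "Example C,7.9,2012-11-30\n"
)
for row in filter_titles(sample, min_rating=7.5, min_year=2015):
    print(row["title"], row["rating"])
```

To run it against a real file, pass an opened file instead of the sample, for example `open("movies.csv", newline="")`, after adjusting the column names.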

Action

  • With Docker
docker build -t yourUser/yourPackage:yourVersion .
  • Directly

Install the requirements listed in requirements.txt (pip3 install -r requirements.txt). Then create the folder structure, or edit the settings in the main script.

python3 scrapper.py

Result

File          Content
movies.csv    CSV file with all indexed movies
series.csv    CSV file with all indexed shows
info.log      Any errors that occurred. Lower the log level if you also want info messages logged
counter.txt   The last indexed URL. Needed to resume if the script is interrupted
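The resume behaviour behind counter.txt could be sketched roughly as follows. This is not the project's actual code: it assumes the saved position can be represented as an integer ID (the real file stores the last indexed URL), and `load_counter`, `save_counter`, and `run` are hypothetical names.

```python
from pathlib import Path


def load_counter(path, default=0):
    """Return the last saved position, or `default` if no valid file exists."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return default


def save_counter(path, value):
    """Persist the position via a temp file and rename, so an interruption
    cannot leave a half-written counter behind."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(str(value))
    tmp.replace(path)


def run(counter_path, last_id, process):
    """Process every ID after the saved position, saving after each success."""
    start = load_counter(counter_path) + 1
    for entry_id in range(start, last_id + 1):
        process(entry_id)  # hypothetical callback that indexes one entry
        save_counter(counter_path, entry_id)
```

Because the counter is written only after an entry has been processed, a restart repeats at most the entry that was in progress when the script stopped.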

Note

ToDo

  • Add error handling for when the Internet connection is not available
  • Add the possibility to re-index failed entries (to go through the indexer faster when a new movie/show is added)
  • Add Multithreading
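For the first item, one possible approach is to retry a request a few times with an increasing delay before giving up. The sketch below uses only the standard library; `fetch_with_retry` and its parameters are hypothetical and not part of the current script.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=3, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying on connection problems with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            with opener(url, timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError:
            # The server answered (e.g. 404), so the connection works: do not retry.
            raise
        except OSError:
            # URLError and socket timeouts are OSError subclasses: likely offline.
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...


# Example call (requires network access):
# html = fetch_with_retry("https://www.imdb.com/")
```

The `opener` and `sleep` parameters exist only so the function can be exercised without a network connection or real waiting.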