imdbscrapper

Scraper that collects movie and show information from IMDB and indexes it into separate movie and show lists, with rating, release date, and a few other details.

Situation

Finding movies or shows to watch based on ratings and release date means doing the searching and note-taking manually.

Task

Create a way to automatically index movie and show entries from IMDB so they can be searched and filtered afterwards with common software (e.g. a spreadsheet).
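For readers who prefer a script over a spreadsheet, the filtering step could look roughly like the sketch below. The column names `title`, `rating` and `release_date`, the date format, and the sample rows are all assumptions made for illustration; the real header of movies.csv may differ.

```python
import csv
import io


def filter_rows(rows, min_rating, min_year):
    """Keep rows rated at least min_rating and released in min_year or later."""
    kept = []
    for row in rows:
        try:
            rating = float(row["rating"])            # hypothetical column name
            year = int(row["release_date"][:4])      # assumes a YYYY-... date string
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or malformed values
        if rating >= min_rating and year >= min_year:
            kept.append(row)
    return kept


# In-memory sample standing in for movies.csv (made-up rows)
sample = io.StringIO(
    "title,rating,release_date\n"
    "Example A,8.1,2019-05-01\n"
    "Example B,6.4,2021-03-12\n"
    "Example C,7.9,2015-11-20\n"
)
selected = filter_rows(csv.DictReader(sample), min_rating=7.5, min_year=2018)
print([row["title"] for row in selected])
```

To run it against a real file, replace the `io.StringIO` sample with `open("movies.csv", newline="", encoding="utf-8")` and adjust the column names to match.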

Action

  • Docker

docker build -t yourUser/yourPackage:yourVersion .
  • Directly

Install the requirements listed in requirements.txt (pip3 install -r requirements.txt). Then create the folder structure or edit the settings in the main script.

python3 scrapper.yml

Result

| File | Content |
| --- | --- |
| movies.csv | CSV file with all indexed movies |
| series.csv | CSV file with all indexed shows |
| info.log | Any errors that occurred. Change the debug level if you want to log info messages |
| counter.txt | The last indexed URL. Needed to continue in case the script is interrupted |
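The role of counter.txt can be illustrated with a small checkpoint sketch. This is not the project's actual code: the helper names `load_last_url`, `save_last_url` and `pending_urls`, the assumption that the file holds a single URL, and the placeholder URLs are all hypothetical.

```python
import os
import tempfile


def load_last_url(counter_path):
    """Return the last indexed URL, or None if no checkpoint exists yet."""
    try:
        with open(counter_path, encoding="utf-8") as fh:
            return fh.read().strip() or None
    except FileNotFoundError:
        return None


def save_last_url(counter_path, url):
    """Write to a temp file, then swap it in, so an interruption is
    unlikely to leave a half-written checkpoint behind."""
    tmp_path = counter_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(url)
    os.replace(tmp_path, counter_path)


def pending_urls(all_urls, last_url):
    """Return the URLs after last_url, or all of them without a usable checkpoint."""
    if last_url is None or last_url not in all_urls:
        return list(all_urls)
    return all_urls[all_urls.index(last_url) + 1:]


# Demo in a throwaway directory with placeholder URLs
with tempfile.TemporaryDirectory() as workdir:
    counter = os.path.join(workdir, "counter.txt")
    urls = [f"https://example.invalid/{n}" for n in (1, 2, 3)]
    first_run = pending_urls(urls, load_last_url(counter))
    save_last_url(counter, urls[1])  # pretend the run stopped after the second URL
    second_run = pending_urls(urls, load_last_url(counter))

print(len(first_run), second_run)
```

Saving the checkpoint after each successfully indexed entry keeps the amount of repeated work after an interruption to at most one entry.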

Note

ToDo

  • Don't insert duplicates into dataTable
  • Add Error Handling in case Internet is not available
  • Add the possibility to re-index failed entries (to go through the indexer faster when a new movie/show is added)
  • Add Multithreading
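As a starting point for the first two items above, here is a minimal sketch of duplicate filtering and retrying on network errors. It assumes that entries are dictionaries identified by a `url` key and that fetching is done by some callable; `dedupe`, `with_retries` and `flaky_fetch` are hypothetical names and do not come from the project.

```python
import time


def dedupe(rows, key="url"):
    """Return rows without repeats of the same key, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


def with_retries(fetch, url, attempts=3, delay=1.0):
    """Call fetch(url), retrying on OSError, which covers ConnectionError,
    timeouts and urllib's URLError. Re-raises after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # wait a little longer each time


# Demo: a stand-in fetcher that fails twice before succeeding
calls = {"n": 0}


def flaky_fetch(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated outage")
    return "page for " + url


rows = [
    {"url": "a", "title": "A"},
    {"url": "b", "title": "B"},
    {"url": "a", "title": "A again"},
]
unique_rows = dedupe(rows)
page = with_retries(flaky_fetch, "a", attempts=3, delay=0)
print(len(unique_rows), page, calls["n"])
```

Entries that still fail after the last attempt could be written to a separate list, which would also give the re-index item a place to start from.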

P.S.: Feel free to improve :)

Some Statistics