Commit Graph

39 Commits

Author SHA1 Message Date
Carlos Sousa
29881deca5 Updated: README.md 2021-09-01 20:56:57 +02:00
Carlos Sousa
d655091a36 Hotfix: Dependencies were missing in functions.py 2021-08-06 13:15:59 +02:00
Carlos Sousa
18e150e71a Fix #4
Added: Remove Duplicates
Fixes #4
2021-08-06 01:22:44 +02:00
Carlos Sousa
feb97c7039 Added: Remove Duplicates 2021-08-06 01:18:57 +02:00
Carlos Sousa
e7f144a438 Updated: Improved ErrorHandling 2021-07-29 20:36:03 +02:00
Carlos Sousa
5d1dcfcccb Updated: PoC database.sql 2021-07-29 20:35:45 +02:00
Carlos Sousa
69a8f85ae3 Updated: Better Error Handling 2021-07-29 20:13:48 +02:00
Carlos Sousa
6419ba97a7 Updated: MariaDB as Back Completed 2021-07-29 17:42:13 +02:00
Carlos Sousa
96e29fd448 Changed: Folder Structure 2021-07-28 19:52:15 +02:00
Carlos Sousa
74d623d39e Added [ToDo:] 2021-05-04 00:06:23 +02:00
Carlos Sousa
f59931efd9 Added ToDo 2021-05-04 00:05:11 +02:00
Carlos Sousa
4c437ade07 Merge branch 'main' of https://github.com/zebrajr/imdbscrapper into main 2021-05-03 17:54:24 +02:00
Carlos Sousa
51ab8659fb Fixed Description - ";" replaced 2021-05-03 17:54:21 +02:00
Carlos Sousa
aa6f8e0f8d Fixed Description - ";" replaced 2021-05-03 17:44:46 +02:00
Carlos Sousa
05dfb1e15b Added link to scrap data repo 2021-05-03 17:06:35 +02:00
Carlos Sousa
ebd422d10c Changed limit on currentEndURL from 0 to endURL 2021-05-03 16:07:22 +02:00
Carlos Sousa
301de775a8 Added env variables for startURL, endURL, steUpCycle, nrProcesses to docker 2021-05-03 15:01:27 +02:00
Carlos Sousa
8d6a3aec8a Merge pull request #1 from zebrajr/devProcesses
Migrated from single to parallel processes
2021-05-03 14:45:15 +02:00
Carlos Sousa
ecb7da2939 Migrated from single to parallel processes 2021-05-03 14:43:08 +02:00
Carlos Sousa
4999912989 Trying I/O improvements 2021-05-03 02:49:25 +02:00
Carlos Sousa
f5cb768a65 Started multithreadding POC 2021-05-03 02:06:12 +02:00
Carlos Sousa
b1984e1fdf 10000000 - 9956224 2021-05-02 21:54:14 +02:00
Carlos Sousa
69c3632d05 Added ToDo 2021-05-02 18:38:56 +02:00
Carlos Sousa
32f2d04397 Added reCheck file logic for better performance on reChecks. Changed check to descending order. 2021-05-02 18:36:17 +02:00
Carlos Sousa
7613816d1a Added Total Rating Count 2021-05-02 16:25:02 +02:00
Carlos Sousa
641d5faf4c Added ToDo 2021-05-02 14:56:18 +02:00
Carlos Sousa
00b9cc19cf First 1640 entries indexed 2021-05-02 14:38:14 +02:00
Carlos Sousa
8511975591 Fixed grammatic 2021-05-02 14:35:54 +02:00
Carlos Sousa
e8034d23ac Updated Task. Updated Action. 2021-05-02 14:34:54 +02:00
Carlos Sousa
bacdbcf75f Fixed info on installing requirements 2021-05-02 14:26:43 +02:00
Carlos Sousa
3bf7895f32 Removed invalid options 2021-05-02 14:25:40 +02:00
Carlos Sousa
894b5c7f0c README.md updated 2021-05-02 14:23:12 +02:00
Carlos Sousa
d7548ba624 Added better error handling. Added continue from last value 2021-05-02 14:10:12 +02:00
Carlos Sousa
af7fe142df Main Logic completed. 2021-05-02 13:44:34 +02:00
Carlos Sousa
7f47c10346 Main logic done.
Will write movies to movies.csv and shows to series.csv
2021-05-01 23:52:37 +02:00
Carlos Sousa
c224144cd5 Added running as non-root user 2021-05-01 23:52:01 +02:00
Carlos Sousa
54289759ea Added directory for storage 2021-05-01 23:51:38 +02:00
Carlos Sousa
eaf30aa206 Added basic file layout, dockerfile and docker-compose.yml 2021-04-30 13:16:40 +02:00
Carlos Sousa
cfe95825bc Initial commit 2021-04-30 13:11:32 +02:00