I'm building an application for some experimentation, and I intend to release it if it works out. It's something many have talked about before: a content recommendation system that works on tags, on other users' favourites, and on what the user has voted on, to predict and suggest 'good' content. The difference is that, unlike most who propose this, I do know the algorithms and statistics to apply.
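For anyone curious what I mean, here's a minimal sketch of one ingredient: scoring unseen posts by tag overlap (Jaccard similarity) with posts the user already favourited. All names and tag data here are made up for illustration; the real system would combine this with favourite and vote signals.

```python
def jaccard(a, b):
    """Jaccard similarity between two tag sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recommend(candidate_posts, liked_tag_sets, top_n=3):
    """Rank unseen posts by average tag similarity to the user's liked posts.

    candidate_posts: {post_id: set_of_tags}
    liked_tag_sets:  list of tag sets from posts the user favourited/upvoted
    """
    scored = []
    for post_id, tags in candidate_posts.items():
        score = sum(jaccard(tags, liked) for liked in liked_tag_sets) / len(liked_tag_sets)
        scored.append((score, post_id))
    scored.sort(reverse=True)
    return [post_id for score, post_id in scored[:top_n]]

# Toy example with made-up tags:
liked = [{"canine", "digital_media", "solo"}, {"canine", "sketch"}]
candidates = {
    1: {"canine", "solo"},
    2: {"feline", "traditional_media"},
    3: {"canine", "digital_media"},
}
print(recommend(candidates, liked, top_n=2))  # posts 1 and 3 outrank post 2
```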
To get this data I've started using the API, which is useful and works well, but downloading all of e621 through it isn't practical for me, and I suspect it isn't practical for the server either. My current script would take over 160 hours to download all the data, while I estimate the real database is only ~200 MB (metadata only, so no actual posts: no images, flashes, or anything like that).
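To show roughly what my scraper is doing (and why it's so slow), here's a sketch of paging backwards through post IDs with a polite delay between requests. The page size and delay are assumptions, and `fetch_page` is a stub standing in for the real HTTP call so the loop runs offline:

```python
import time

PAGE_LIMIT = 320      # assumed maximum posts per request
REQUEST_DELAY = 1.0   # assumed polite delay between requests, in seconds

def fetch_page(before_id, limit):
    """Stub standing in for an HTTP GET against the posts endpoint.

    A real implementation would request a page of posts older than
    `before_id` and parse the JSON. Here we fake a database of 1000
    posts so the loop is runnable without network access.
    """
    fake_db = [{"id": i} for i in range(1000, 0, -1)]  # newest first
    return [p for p in fake_db if p["id"] < before_id][:limit]

def scrape_all(delay=0.0):
    """Walk backwards through post IDs until a page comes back empty."""
    posts, before_id = [], float("inf")
    while True:
        page = fetch_page(before_id, PAGE_LIMIT)
        if not page:
            break
        posts.extend(page)
        before_id = page[-1]["id"]  # continue from the oldest ID seen so far
        time.sleep(delay)           # rate limiting; pass REQUEST_DELAY for real use
    return posts

print(len(scrape_all()))  # 1000
```

With millions of posts and one request per second, that delay alone is where the hours go, which is why a single bulk dump would be so much kinder to both sides.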
So I was just wondering if the admins at e621 could make a raw database dump (in whatever form) that contains information similar to what's available from the API, and then upload it to a file-sharing service.
For my purposes I know I need a Posts table with the PostID, PostURL, PostTags, PostsFavoritedByUsers, etc. Maybe Sets and Pools data too, if possible.
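As a sketch of the shape I have in mind (SQLite here, and the table and column names are just my guesses at what a dump would need): tags and favourites are many-to-many with posts, so they'd naturally live in join tables rather than as columns on Posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    post_id  INTEGER PRIMARY KEY,
    post_url TEXT NOT NULL
);
-- Tags and favourites are many-to-many, hence separate join tables.
CREATE TABLE post_tags (
    post_id INTEGER REFERENCES posts(post_id),
    tag     TEXT NOT NULL,
    PRIMARY KEY (post_id, tag)
);
CREATE TABLE post_favorites (
    post_id INTEGER REFERENCES posts(post_id),
    user_id INTEGER NOT NULL,
    PRIMARY KEY (post_id, user_id)
);
""")

# Toy rows to show the shape of the data:
conn.execute("INSERT INTO posts VALUES (1, 'https://example.org/post/1')")
conn.executemany("INSERT INTO post_tags VALUES (?, ?)",
                 [(1, "canine"), (1, "solo")])
conn.execute("INSERT INTO post_favorites VALUES (1, 42)")

tags = [row[0] for row in
        conn.execute("SELECT tag FROM post_tags WHERE post_id = 1 ORDER BY tag")]
print(tags)  # ['canine', 'solo']
```

Any dump format would do, though: CSV, JSON lines, a SQL dump, whatever is easiest to produce.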
Also, if anyone is actively scraping the e621 database and already has a near up-to-date copy with the info I requested, sharing your database instead would work equally well.
Thanks, I hope someone can help me with this, and I hope to release the content recommendation system once I've got it set up.
Updated by mrox