Topic: Another e621 pool downloader

Posted under e621 Tools and Applications

Here I come to share with everyone another pool downloader :-)

Now, I know that there are others, but I kinda felt like making my own version. (I had a lot of free time.)

Anyway, if anyone wants to try it, here it is:
>> LINK <<

It's a Java application, so you'll need a JRE.

Enjoy!

Do you have plans for releasing the source code for this application in the future?

I don't really like executing unknown Java applications downloaded from the internet on a Windows machine, for security reasons. What I'd welcome most is the option to look into the sources to see what the program does, so I can be sure it won't do anything harmful.

Anyway, I see you used jsoup, and since I used that too in one of my projects, I have the feeling that your program won't do anything harmful :)

I've tested it out and it's working great! The only thing I noticed is that it causes very high CPU usage, so you might have to optimize it a bit. But overall, well done ;)

source code

The CPU usage is interesting... It doesn't use much when I'm home.
But I got curious since you mentioned it, so I tested with a better internet connection this weekend, and you're right! When the connection is good enough, it causes high CPU usage. I didn't notice before because my connection at home isn't very good.

However, when the connection is good it also finishes way faster, so from that point of view it doesn't make much of a difference.

I intend to make it work with other sites in the future, so if I have a better idea for the code while I'm at it, I'll make it work better :)

I looked at the code, and it seems to be parsing the HTML instead of using an API endpoint. This could be the cause of the CPU usage issue.
If you are interested in using the API, we have documentation here: e621:api

Additionally, I noticed there does not appear to be any rate limiting (it just fetches every page as fast as possible). While this may be OK with small pools, downloading big pools can result in getting timed out by the server.
There also does not appear to be any User-Agent specified. I highly recommend specifying one, as stock user agents usually get blacklisted.
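
For what it's worth, here is a minimal sketch of both suggestions, assuming Java 11+ and the built-in java.net.http client rather than whatever the downloader actually uses. The class name PoliteClient, the example User-Agent string, and the 1000 ms interval are all made up for illustration, so adjust them to your project and to whatever the API documentation asks for.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sketch only (hypothetical class): spaces requests out and sends a custom User-Agent. */
final class PoliteClient {
    private final HttpClient http = HttpClient.newHttpClient();
    private final String userAgent;        // e.g. "MyPoolDownloader/1.0 (by your_username)", a placeholder
    private final long minIntervalMillis;  // minimum gap between two requests (assumed value, e.g. 1000)
    private long lastRequestAt = -1L;      // -1 means no request has been sent yet

    PoliteClient(String userAgent, long minIntervalMillis) {
        this.userAgent = userAgent;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** How many milliseconds are still left to wait at time nowMillis. */
    long millisToWait(long nowMillis) {
        if (lastRequestAt < 0) {
            return 0L;
        }
        long elapsed = nowMillis - lastRequestAt;
        return Math.max(0L, minIntervalMillis - elapsed);
    }

    /** Records that a request was sent at time nowMillis. */
    void markRequest(long nowMillis) {
        lastRequestAt = nowMillis;
    }

    /** Builds a GET request that carries the custom User-Agent header. */
    HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    /** Waits out the remaining interval if needed, then fetches the URL and returns the body. */
    synchronized String fetch(String url) throws IOException, InterruptedException {
        long wait = millisToWait(System.currentTimeMillis());
        if (wait > 0) {
            Thread.sleep(wait);
        }
        markRequest(System.currentTimeMillis());
        HttpResponse<String> response =
                http.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

You would create one PoliteClient for the whole download and call fetch(...) for every page, so all requests share the same timer. Keeping the wait calculation in its own method makes it easy to check without touching the network.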

Chaser said:
I looked at the code, and it seems to be parsing the HTML instead of using an API endpoint. This could be the cause of the CPU usage issue.
If you are interested in using the API, we have documentation here: e621:api

Additionally, I noticed there does not appear to be any rate limiting (it just fetches every page as fast as possible). While this may be OK with small pools, downloading big pools can result in getting timed out by the server.
There also does not appear to be any User-Agent specified. I highly recommend specifying one, as stock user agents usually get blacklisted.

I do need to learn about programming web applications. Actually, this is the first application I've made that requires internet access, so it probably has many mistakes.

Thanks for the tips :)
I'll look into them when I get back to this.

EDIT (24/01/2018): I made some changes to the code. It now uses the API, so it makes far fewer requests. I also specified the User-Agent and made sure there will be more than 1 second between requests.
