Topic: How do I use the API?

Posted under e621 Tools and Applications

I've recently gotten into learning Python as a hobby, and for my first big project I wanted to build an image scraper for e621. This is my first time doing any project that connects to the internet, so I'm not sure how to code for it. Any help is appreciated.

Without any specific questions, people aren't really going to be able to provide much help. You can see the official API documentation here. As for using Python, I'm not too sure; just google "How to interact with a REST API with Python" or something.

akimfur said:
I've recently gotten into learning Python as a hobby, and for my first big project I wanted to build an image scraper for e621. This is my first time doing any project that connects to the internet, so I'm not sure how to code for it. Any help is appreciated.

You'll want to use the third-party requests library to perform requests to the e621 API (there's also the built-in urllib, but it's quite low-level). The API is documented here.
Essentially, you are loading special pages on e621 that output JSON text, which your Python script can then process. For example, sending a GET request to https://e621.net/posts.json returns a listing of recent posts in JSON format: the same thing as the regular https://e621.net/posts page, but machine-readable. You can try this in your browser by simply putting the link in your address bar.
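
Here is a minimal sketch of that GET request, for illustration only. It uses the built-in urllib.request so that it runs without installing anything; with the requests library the fetch would be roughly requests.get(url, headers=headers, timeout=10).json(). The function names and the User-Agent value are placeholders I made up, not anything the API prescribes.

```python
import json
import urllib.request

POSTS_URL = "https://e621.net/posts.json"

# Placeholder value (an assumption): replace it with your own project name,
# version, and contact, following the rules in the API documentation.
USER_AGENT = "MyScraper/0.1 (by your_username)"


def build_request(url=POSTS_URL, user_agent=USER_AGENT):
    """Prepare a GET request that carries a descriptive User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body into Python objects."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_json(build_request()) performs the actual network request. Print the result once to see which fields come back before writing code that depends on them.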

Do note that you need to follow the rules for accessing the e621 API (they are listed in the API documentation), such as sending a non-generic User-Agent header (see here) with your requests. You also need to respect the rate limit, i.e. not send too many requests too quickly (you can use the third-party ratelimit Python package to easily add an automatic cooldown to your script).
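
If you would rather not add a dependency, a cooldown can also be hand-rolled with the standard library. This is only a sketch of the idea, not the ratelimit package's API: the Throttle class is a hypothetical helper, and the one-second interval in the usage note is a placeholder, so take the real limit from the API documentation.

```python
import time


class Throttle:
    """Make sure consecutive calls to wait() are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the class can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Called too soon after the previous request: pause for the rest.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

To use it, create one instance up front, for example throttle = Throttle(1.0), and call throttle.wait() immediately before every request your script sends.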

If you are a beginner, it might seem daunting to have a bunch of technical mumbo-jumbo thrown at you all at once, but in the end it's quite simple, and I think this can be a good beginner's project for learning how to deal with web APIs.

Good luck; have fun!


There are people saying that Python will be slow, but for this application, it DOES NOT MATTER. If you're pulling that many images down, then you need to slow yourself down, anyways.

Any suggestions for User Agent strings? Like, name of your program and version at the minimum? Features your program understands? Spider/bot versus just a more sophisticated UI? (i.e. Wolf's Stash is not really a bot in the normal sense, IMO)

Yeah, parsing JSON in Python with your own library (or worse, hand-coded once, in place) is a non-starter. You'll just end up with buggier, less effective code that uses about the same resources and is no easier to use. If you're wanting to do stuff like that to save battery life or whatever on a portable device, look into compiled programs. After all, that's what Android's Dev Kit is for! ;)
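
To illustrate the point, here is a small sketch using Python's built-in json module. The sample text is made up and is not a real API response; parse_or_none is a hypothetical helper name.

```python
import json

# Made-up sample text; real responses will have different fields.
sample = '{"name": "caf\\u00e9 \\"quoted\\"", "tags": ["a", "b"], "score": 3}'

# The module decodes escapes, unicode, nesting, and numbers for you,
# which is exactly where hand-written parsers tend to go wrong.
data = json.loads(sample)
name = data["name"]
tag_count = len(data["tags"])


def parse_or_none(text):
    """Return the decoded JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Catching json.JSONDecodeError lets a script skip a truncated or malformed response instead of crashing.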

As an aside, if you use XML/JSON/... in a program, you can compress it with ZIP (Deflate) or GZ, like FreeCol and other open-source games do. The JSON from this site is so small that compressing it makes no sense.
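
For anyone curious, this is roughly what that looks like with Python's built-in gzip module. It is a sketch with made-up data; pack and unpack are hypothetical helper names.

```python
import gzip
import json


def pack(obj):
    """Serialize obj to JSON and compress it with gzip (Deflate)."""
    return gzip.compress(json.dumps(obj).encode("utf-8"))


def unpack(blob):
    """Reverse pack(): decompress and decode back into Python objects."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Made-up data: one tiny object and one large, repetitive list.
small = {"id": 1}
large = [{"id": i, "tags": ["example", "placeholder"]} for i in range(1000)]

for label, obj in (("small", small), ("large", large)):
    raw_size = len(json.dumps(obj).encode("utf-8"))
    print(label, raw_size, "bytes raw,", len(pack(obj)), "bytes gzipped")
```

The gzip format adds a fixed header and trailer, so a tiny payload comes out larger than it went in, while large, repetitive data shrinks considerably.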

kora_viridian said:
Apparently, DText doesn't insert the <a name="foo"> anchors it would take to make that work.

People no longer remember the things we developed to deal with the case when long web page was long.

To be fair, it is right at the top of the article. :D Not the kind of 'visual search' I bet most companies like Google are thinking of!
