Topic: Error: Using index.json won't look up post IDs

Posted under e621 Tools and Applications

Using the script that another user provided me, I've been using https://e621.net/post/index.json?tags=md5: to look up tags of posts by MD5sum. I tried https://e621.net/post/index.json?id=md5: to look up the exact post ID of a picture by MD5sum, but it keeps returning the post ID of the most recent post... Is there something wrong with my URL?

Updated by savageorange

Yes.

https://e621.net/post/index.json?tags=md5:

is correct; it means 'return the record(s) of the post(s) with these tags: md5:SOME_MD5_SUM'.
One of the fields of that record is 'id', another is 'tags'.

The script extracts the content of the tags field using jq.
If you want id, extract that instead.
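To illustrate the point about fields, here is a minimal sketch of pulling either field out of the first record with jq. The sample record is a made-up placeholder (not real e621 output), and first_post_field is a hypothetical helper name; only the 'id' and 'tags' field names come from the discussion above.

```shell
#!/usr/bin/env bash
# Hypothetical sample response: an array holding one post record.
# The values are placeholders, not real e621 data.
sample='[{"id": 12345, "tags": "female lemur"}]'

# Read a post/index.json response on stdin and print one field
# of the first record. The field name is passed as $1.
first_post_field() {
    jq --raw-output --arg field "$1" '.[0][$field]'
}

# Usage (requires jq):
#   printf '%s' "$sample" | first_post_field tags   # what the original script reads
#   printf '%s' "$sample" | first_post_field id     # what is wanted here
```

The request URL stays the same in both cases; only the jq filter changes.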

Updated by anonymous

savageorange said:
Yes.

https://e621.net/post/index.json?tags=md5:

is correct; it means 'return the record(s) of the post(s) with these tags: md5:SOME_MD5_SUM'.
One of the fields of that record is 'id', another is 'tags'.

The script extracts the content of the tags field using jq.
If you want id, extract that instead.

Here's my modified code to accommodate the ID field:
file="$(zenity --file-selection)"; file="$1"; wget -O - 'https://e621.net/post/index.json?id=md5:'$(md5sum "$file" | cut -d ' ' -f 1) | jq --raw-output '.[0].id' - > /var/www/myimouto/public/e6id.txt

Here's the output of /var/www/myimouto/public/e6id.txt:
948072

That's not the right post ID... That was the ID of the most recent post at the time I ran the script.

I tried using this instead:
filename="$(zenity --file-selection)"; wget -O - 'https://e621.net/post/show.xml?md5='$(md5sum "$filename" | cut -d ' ' -f 1) > /var/www/myimouto/public/e6id.xml;

But the output includes a lot more details I have no use for, and I have no idea how to filter out all the details not related to post IDs.

Updated by anonymous

Eh?

file="$(zenity --file-selection)"; file="$1"

.. Here, you are asking zenity for a filename, assigning the filename it outputs to the variable 'file', and then immediately throwing away that filename (reassigning file="$1"). You still should have the same problem as before, according to that. So it's quite interesting that you are getting any result at all.
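If the intent was to accept a path on the command line and only fall back to the zenity dialog when none is given, one possible sketch looks like this. choose_file is a hypothetical helper name, and whether that fallback behaviour is what you want is an assumption on my part.

```shell
#!/usr/bin/env bash
# Print the file path to use: the first argument if one was given,
# otherwise whatever the user picks in the zenity dialog.
choose_file() {
    local path="${1:-}"
    if [ -z "$path" ]; then
        # Only ask zenity when no argument was supplied, so the
        # chosen name is never overwritten afterwards.
        path="$(zenity --file-selection)" || return 1
    fi
    printf '%s\n' "$path"
}

# Usage inside a script:
#   filename="$(choose_file "$@")" || exit 1
```

The point is simply that the variable is assigned once, from one source or the other, instead of being assigned twice in a row.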

Updated by anonymous

savageorange said:
Eh?

file="$(zenity --file-selection)"; file="$1"

.. Here, you are asking zenity for a filename, assigning the filename it outputs to the variable 'file', and then immediately throwing away that filename (reassigning file="$1"). You still should have the same problem as before, according to that. So it's quite interesting that you are getting any result at all.

Huh? Oh, I thought I had made an exact copy of the code you helped me with... hmm... lemme try again.

Updated by anonymous

Okay, I copied and pasted your code word-for-word from the last thread, but it still isn't returning the right post ID.

Either the index.json file doesn't have the capability to search for post IDs by md5, or perhaps there's an error with the way my computer is sending its headers.

Here's the code I used:
filename="$(zenity --file-selection)"; wget -O - 'https://e621.net/post/index.json?id=md5:'$(md5sum "$filename" | cut -d ' ' -f 1) | jq --raw-output '.[0].id' - > /var/www/myimouto/public/e6tags.txt

Here's the output:
948095

Either way, the XML method I used earlier seems to work, and while it may swamp me with a lot of data, at least Firefox does a good job of keeping everything color-coded and organized.

Updated by anonymous

id=md5:

? That's the same problem as I highlighted in my first reply in this thread.

(I just double checked that I didn't suggest any url with 'id=' in it, FWIW, only 'tags=')

e621 is simply ignoring the id= field, as it doesn't know what to do with it. If you look at the raw data, you'll see that it's simply returning the latest posts (since your query is effectively empty), which explains why taking the id value of the first item returns the newest post id.

A correct URI that uses 'tags=' does work (I verified this by changing id= to tags= in the latest snippet you posted), which fits with your success with xml (in which you correctly use 'tags=').
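One way to catch this kind of silent failure is to check that the record that comes back actually has the md5 you asked for before trusting its id. This is only a sketch: it assumes the returned record carries an 'md5' field, the sample data is invented, and id_if_md5_matches is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical sample response; the values are placeholders.
sample='[{"id": 948072, "md5": "aaaa"}]'

# Read a post/index.json response on stdin and print the id of the
# first record only if its md5 equals the hash passed as $1.
# Prints nothing when the hash does not match or no record is returned.
id_if_md5_matches() {
    jq --raw-output --arg want "$1" '.[0] | select(.md5 == $want) | .id'
}

# Usage (requires jq):
#   printf '%s' "$sample" | id_if_md5_matches aaaa   # prints the id
#   printf '%s' "$sample" | id_if_md5_matches bbbb   # prints nothing
```

With a check like this, an ignored query parameter produces empty output rather than the id of an unrelated post.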

Updated by anonymous

You're a bit confused about how the post/index url format works.

First of all, understand that tags is the only meaningful parameter you can pass to post/index to filter posts. (although you can use stuff like limit to limit the number of results)

There are two types of 'tags': first are the standard tags everyone knows and loves, the ones that show up next to a post, like female and lemur.

The second type is called 'metatags', and these are a way of searching posts using data about the post itself, such as the number of people who favorited it (favcount:4), id of the post (id:>500000), and the post's md5 checksum (md5:a3fb26...), but there are many others. These still go in the tags parameter of post/index.

Now, what you're confused about: you seem to think that when you do post/index?tags=md5:foo, you're attempting to retrieve the tags for a post with the given md5. What you're actually doing is searching for posts whose tags include md5:foo (which is a metatag, as explained above). As such, it'll return exactly one post (md5s are unique), and then you just read the tags attribute of that post, which is unrelated to the tags parameter you passed.

Trying to search using id instead would just return all of the latest posts as if you didn't provide any parameters at all, because id isn't a valid parameter to pass to post/index (to search by id, use post/index?tags=id:12345).

If you needed to get a particular post's id, then search like you are now, but instead of reading the returned post's tags attribute, read its id attribute.
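As a rough sketch of the above, every search goes through the tags parameter, and metatags such as md5: or id: are just part of its value. The helper names post_index_url and md5_of_file are hypothetical; the endpoint, the tags and limit parameters, and the md5sum/cut pipeline are the ones already used in this thread.

```shell
#!/usr/bin/env bash
# Print the md5 checksum of the file given as $1.
md5_of_file() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Build a post/index.json search URL.
#   $1 = search query (ordinary tags or metatags such as md5:... or id:...)
#   $2 = maximum number of results (defaults to 1)
# No URL-encoding is done here, so this only suits simple queries.
post_index_url() {
    local query="$1" limit="${2:-1}"
    printf 'https://e621.net/post/index.json?tags=%s&limit=%s\n' "$query" "$limit"
}

# Usage (network and jq required):
#   url="$(post_index_url "md5:$(md5_of_file "$filename")")"
#   wget -q -O - "$url" | jq --raw-output '.[0].id'
```

Swapping '.[0].id' for '.[0].tags' in the last line gives the tags of the same post; the URL does not change.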

Updated by anonymous

Also, you said you tried post/show instead of post/index. This method only works if you know the post's id or md5, but it's better than post/index for two reasons.

One, it's a little easier on our servers than searching, because searching uses slow, complex code, whereas grabbing a specific post out of the database is much easier and faster.

Two, the results of post/index are wrapped in an array (JSON) or a <posts> element (XML). It's exactly the same results as post/show, but it's one additional step you need to go through to get the post data. Usually the difference is just between post.id and posts[0].id (depending on the language), so it's not a big deal, but if you can use post/show, why not? There's no downside :)
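A small sketch of that difference in shell terms, for anyone following along with jq. It assumes post/show is available with a .json extension and returns a bare record; post_show_url and id_filter are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build a post/show URL that looks a post up directly by md5 ($1).
# Assumption: the .json variant of post/show exists.
post_show_url() {
    printf 'https://e621.net/post/show.json?md5=%s\n' "$1"
}

# Print the jq filter that reads the post id for a given endpoint:
# post/index wraps results in an array, post/show returns one record.
id_filter() {
    case "$1" in
        index) printf '%s\n' '.[0].id' ;;
        show)  printf '%s\n' '.id' ;;
        *)     return 1 ;;
    esac
}

# Usage (network and jq required):
#   wget -q -O - "$(post_show_url "$hash")" | jq --raw-output "$(id_filter show)"
```

The only change on the reading side is dropping the '[0]' step.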

Updated by anonymous

TonyLemur said:
Also, you said you tried post/show instead of post/index. This method only works if you know the post's id or md5, but it's better than post/index for two reasons.

One, it's a little easier on our servers than searching, because searching uses slow, complex code, whereas grabbing a specific post out of the database is much easier and faster.

Two, the results of post/index are wrapped in an array (JSON) or a <posts> element (XML). It's exactly the same results as post/show, but it's one additional step you need to go through to get the post data. Usually the difference is just between post.id and posts[0].id (depending on the language), so it's not a big deal, but if you can use post/show, why not? There's no downside :)

I would've responded back during my lunch break, but I had to cut it early to finish up some work.

After editing the code, I was finally able to get the script working! I used the slight modification shown below:
filename="$(zenity --file-selection)"; wget -O - 'https://e621.net/post/index.json?tags=md5:'$(md5sum "$filename" | cut -d ' ' -f 1) | jq --raw-output '.[0].id' - > /var/www/myimouto/public/e6tags.txt

After inputting /media/ha-gay/6F49B10D115AF445/Yiff/747069437bd189aec817744c8f4120de.jpg (which funnily enough had the md5sum in its title lol), it rendered the following in /var/www/myimouto/public/e6tags.txt:
921840

Now that that's all sorted out, I can create a PHP script that shows the ID of the post, tags, and source links.

Thanks for all your help, @TonyLemur and @savageorange!

Updated by anonymous

TonyLemur said:
Also, you said you tried post/show instead of post/index. This method only works if you know the post's id or md5, but it's better than post/index for two reasons.

One, it's a little easier on our servers than searching, because searching uses slow, complex code, whereas grabbing a specific post out of the database is much easier and faster.

Two, the results of post/index are wrapped in an array (JSON) or a <posts> element (XML). It's exactly the same results as post/show, but it's one additional step you need to go through to get the post data. Usually the difference is just between post.id and posts[0].id (depending on the language), so it's not a big deal, but if you can use post/show, why not? There's no downside :)

And since I'll be entering each picture in this script by hand, it wouldn't cause that much server load. Now, if I were to make an automated script to do this, then I would see the concern, but I will respect any concerns of server load in any future scripts I make/adapt.

Also, is there some kind of API documentation for the index.json file? While looking through the API doc under ">>", I couldn't seem to find anything about it.

Updated by anonymous

Thanks Tony for explaining the things I didn't have the patience to (and pointing out that post/show can accept an md5, which I didn't notice.)

Nikolai-The-Wolfdog said:
Also, is there some kind of API documentation for the index.json file? While looking through the API doc under ">>", I couldn't seem to find anything about it.

All of the fields in records returned by post/index or post/show are documented under Posts -> Show. This is slightly out of date (IIRC the fav_count is not currently returned, for example), but any fields that are actually present should behave as described.

Updated by anonymous
