Topic: What is the History of our Highest Rated Post?

Posted under General

In other words, what's our FMA:B?

Currently it's Airlock Lust, which has held the spot for a strong 2 years; before that, I still remember zonk's last animation, the Renamon x Guilmon one, being number 1. But Umbreedon has grown rapidly and is likely to dethrone the 2nd place post, and maybe Airlock Lust eventually.

What I'm curious about is how often the number 1 post changes and what its history is. I'm calling on all historians and vets on this site with a good memory.

Also, if there is some way to see the top posts of the past (aside from the Wayback Machine), that obviously would help too.

For as long as I can remember, the first result for order:favcount and order:score has always been one of ZonkPunch's animations. I don't think he even got dethroned during his extended hiatus.

EDIT: you can probably just search order:score 20xx and get a pretty good estimate of which posts were the most popular for each year

darryus said:
EDIT: you can probably just search order:score 20xx and get a pretty good estimate of which posts were the most popular for each year

That's just for the year of release; I'm talking about what held the number 1 spot throughout the year.

benjiboyo said:
That's just for the year of release; I'm talking about what held the number 1 spot throughout the year.

Well, post upvotes and favs drop off pretty quickly after the initial upload date, so the likes most posts have now aren't going to be too dissimilar to their count within the initial year of upload. So you should be able to just compare the top few of each year and have a fairly accurate list of the top post of all time for each year.

I think that the only exception to this would be stuff from really early in the site's life. Posts from 2007 that are on the very last page of some searches are going to continue to gain a constant trickle of likes and faves pretty much forever, so their current counts are going to be pretty far off from their initial counts.
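The comparison described above can be sketched in a few lines. This is only an illustration: it assumes you have already copied down the current top-scored post for each upload year (for example from an order:score 20xx search), and it assumes scores barely move after the year of upload, which is exactly the caveat raised for very old posts. The post names and scores below are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical data: upload year -> (post name, current score) of that
# year's highest-scored post. None of these values are real.
yearly_top: Dict[int, Tuple[str, int]] = {
    2015: ("post_a", 4000),
    2016: ("post_b", 3500),
    2017: ("post_c", 6000),
    2018: ("post_d", 5000),
}


def all_time_leader_by_year(tops: Dict[int, Tuple[str, int]]) -> List[Tuple[int, str]]:
    """For each year, return the best post uploaded in that year or earlier."""
    history = []
    leader = None  # (name, score) of the best post seen so far
    for year in sorted(tops):
        candidate = tops[year]
        # A newer post only takes over if it strictly beats the old leader.
        if leader is None or candidate[1] > leader[1]:
            leader = candidate
        history.append((year, leader[0]))
    return history


for year, name in all_time_leader_by_year(yearly_top):
    print(year, name)
```

This only estimates who was on top at the end of each year; it cannot show changes of the lead within a year.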

kora_viridian said:
All that.
At least if you live in the United States.

Yeah, sorry, all this is just going over my head. But thanks for the very... detailed solution.

Also, if you couldn't tell from my activity time (being close to JST), I'm not from the US.

kora_viridian said:
You could write a script to hit the API once a day, ask for the data on the top 100 or 200 posts, and save that to a file. Then, later, you go back through those files and chart which post was #1 (or which ones were #1-#10, etc) over time. That only works starting the day you write the script, though - it doesn't tell you much about the past.
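A rough sketch of such a script, using only the Python standard library. Several things here are assumptions rather than confirmed details: the posts.json endpoint with tags and limit parameters, the response shape with a "posts" list whose items carry "id" and "score"["total"], and the placeholder User-Agent string. Check the current API documentation before relying on any of it. The file layout (one JSON file per day) is just one convenient choice.

```python
import json
import urllib.parse
import urllib.request
from datetime import date
from pathlib import Path

API_URL = "https://e621.net/posts.json"  # assumed endpoint
USER_AGENT = "top-post-history/0.1 (by your_username)"  # placeholder value


def build_url(limit=200):
    """Build the search URL for the current top posts by score."""
    query = urllib.parse.urlencode({"tags": "order:score", "limit": limit})
    return f"{API_URL}?{query}"


def fetch_top_posts(limit=200):
    """Download the current top posts (network call, run once a day)."""
    request = urllib.request.Request(
        build_url(limit), headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumed shape: {"posts": [{"id": ..., "score": {"total": ...}}, ...]}
    return [{"id": p["id"], "score": p["score"]["total"]} for p in payload["posts"]]


def save_snapshot(posts, directory, day):
    """Write one day's ranking to <directory>/<YYYY-MM-DD>.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{day.isoformat()}.json"
    path.write_text(json.dumps(posts))
    return path


def leaders_over_time(directory):
    """Read all saved snapshots and list (day, id of the #1 post)."""
    history = []
    for path in sorted(Path(directory).glob("*.json")):
        posts = json.loads(path.read_text())
        if posts:
            top = max(posts, key=lambda p: p["score"])
            history.append((path.stem, top["id"]))
    return history
```

A daily cron job or scheduled task could call save_snapshot(fetch_top_posts(), "snapshots", date.today()), and leaders_over_time("snapshots") would then give the data to chart.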

If e621 keeps more than four days' worth of the database dumps somewhere, and if the admins can provide them to you, you could answer your question with that information. Two points:

1) I'm not sure if the database dumps go back further than spring 2020, when the current version of e621 went online.

2) If the database dumps do go back to 2007, it'll be a lot of data. The posts file is currently a little over 1 GB a day, which means a year's worth of data will be approximately 0.36 TB. The limit condition is that volume of data for 16 years, or something like 5.8 TB. Of course, the posts file wouldn't have been nearly that big in the earlier years of the site; my guesstimate is that the total size of all the database dumps would be something like 50% of that number, or about 2.9 TB.

This is enough data that downloading it becomes an issue. If you have a 1 gigabit connection with unlimited transfer and can actually sustain that rate, the past year's worth of database dumps (mid-2022 to mid-2023) would take a little under an hour to download, and my 2.9 TB guesstimate for the whole thing would take around 6.5 hours. If you have a slower connection or data caps, it would take much longer: at a sustained 12 megabits per second or so, that's a little under 3 days for the past year and around 22 days to download it all. This is getting into the territory where it would be faster to buy a new hard drive or SSD, mail it to e621, have them copy all the dump files to it, and mail it back to you. At least if you live in the United States.
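The arithmetic behind these estimates can be rerun for any connection speed. A minimal sketch, assuming decimal units (1 GB = 10^9 bytes), the roughly 1 GB per day figure from the post, and a link that sustains its rated speed with no protocol overhead; the list of speeds is only illustrative. Under those assumptions, download times measured in days or weeks correspond to a sustained rate of roughly 12 Mbit/s, while a fully saturated gigabit link would need well under a day.

```python
GB = 10**9   # decimal gigabyte, in bytes
TB = 10**12  # decimal terabyte, in bytes


def download_seconds(size_bytes, bits_per_second):
    """Ideal transfer time: total bits divided by the sustained bit rate."""
    return size_bytes * 8 / bits_per_second


daily_dump = 1 * GB                # posts file size per day, from the post
one_year = 365 * daily_dump        # about 0.36 TB
sixteen_years = 16 * one_year      # upper bound, about 5.8 TB
guess_total = 0.5 * sixteen_years  # the post's 50% guesstimate, about 2.9 TB

# Illustrative speeds only; real throughput is usually below the rated speed.
for label, rate in [("1 Gbit/s", 1e9), ("100 Mbit/s", 1e8), ("12 Mbit/s", 12e6)]:
    year_hours = download_seconds(one_year, rate) / 3600
    total_hours = download_seconds(guess_total, rate) / 3600
    print(f"{label}: one year {year_hours:.1f} h, everything {total_hours:.1f} h")
```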

Torrents are an extremely efficient way of distributing large data sets. Plus you can just throttle it to whatever your connection can handle.

In the database, every vote (not favorites) is stored with the date it was created, so it would actually be very easy to calculate this information. The problem is this information is only visible to staff.

I do remember requesting anonymized access to it quite a few years back (just the time the vote was created and whether it was positive or negative), but it was rejected with reasoning along the lines of "It's not something that users need to know."
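To show why timestamped votes would make this easy, here is a sketch of the calculation. Since the real vote data is staff-only, the record format is hypothetical: each entry is (date, post id, +1 or -1), and the sample votes are invented. Ties are resolved in favor of the post already in the lead, which is a choice made here, not anything the site defines.

```python
from collections import defaultdict

# Hypothetical anonymized vote log: (ISO date, post id, +1 or -1).
votes = [
    ("2020-03-01", 101, +1),
    ("2020-03-02", 101, +1),
    ("2020-03-05", 202, +1),
    ("2020-04-01", 202, +1),
    ("2020-04-02", 202, +1),
    ("2020-04-03", 101, -1),
]


def leader_changes(vote_log):
    """Replay votes in date order and record each change of the #1 post."""
    scores = defaultdict(int)
    leader = None
    changes = []
    for when, post_id, value in sorted(vote_log):
        scores[post_id] += value
        # Highest score wins; on a tie the current leader keeps the spot.
        best = max(scores, key=lambda pid: (scores[pid], pid == leader))
        if best != leader:
            leader = best
            changes.append((when, leader))
    return changes


for when, post_id in leader_changes(votes):
    print(when, post_id)
```

Recomputing the maximum after every vote is slow for millions of votes; a real run would want a heap or a per-day aggregation, but the idea is the same.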
