Topic: Post suggestion API

I'm working on building a post suggestion API, which could be integrated into a web browser as an extension to add suggested posts to e621.net.

It'll feature an option for the original post suggestion algorithm (implemented in material e621), as well as one I have come up with (involving dynamic indexes to identify trends in people that like similar content and using that for predictive suggestions).
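To give a rough idea of what "people that like similar content" could mean in code, here is a minimal sketch of a favorites-overlap recommender. It is not the material e621 algorithm and not the dynamic-index approach described above, just plain user-to-user similarity (Jaccard overlap). The `favorites` layout (user name mapped to a set of post IDs) and the function name `suggest_posts` are assumptions made for illustration.

```python
from collections import Counter


def suggest_posts(favorites, user, top_n=5):
    """Suggest posts favorited by users whose favorites overlap with `user`'s.

    `favorites` is a hypothetical in-memory layout: user -> set of post IDs.
    """
    mine = favorites[user]
    scores = Counter()
    for other, theirs in favorites.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # nothing in common, so this user contributes no signal
        # Jaccard similarity: shared favorites divided by combined favorites.
        similarity = overlap / len(mine | theirs)
        # Every post the other user likes that we have not favorited yet
        # gets a vote weighted by how similar that user is to us.
        for post in theirs - mine:
            scores[post] += similarity
    return [post for post, _ in scores.most_common(top_n)]


example_favorites = {
    "a": {1, 2, 3},
    "b": {1, 2, 4},
    "c": {3, 5},
    "d": {9},
}
print(suggest_posts(example_favorites, "a"))
```

Looping over every user like this would be far too slow for ~900000 favorite lists, so a real version would need some kind of index from post to the users who favorited it.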

As it's still in early development, it's not fully working yet. However, I wanted to post about it to "get the word out" so to speak, and figure out what features people wanted to see so I can plan for them (if/when possible).

At this time, I'm still scraping the favorite lists of ~900,000 users; after that, I have to scrape users and then posts. When complete, this dataset will not only let me train what I hope will be a fairly accurate model for suggesting posts, but could also be used to compute stats such as tag popularity over time. (Due to rate limits, this step will take about 10 days or less for the users and favorites, and about 15 days for the posts.)

Update on dataset fetch time: I've been able to decrease the time from 10 days for users & favorites to ~2.5 days for both.

I intend to make the dataset publicly accessible.

I'll be posting updates in this thread.

There is no need to scrape posts/tags, https://e621.net/db_export/ exists. I still remember topic #23038 which I used a bit, it basically recommended you users with similar favorites to yours. It's not working anymore unfortunately.

The other person claimed it would take months to fetch all the favorites, did you maybe miscalculate something (ignore if you already started and are almost finished)? It looks like Kira offered the dataset to the other person, maybe you could also just ask.

earlopain said:
There is no need to scrape posts/tags, https://e621.net/db_export/ exists. I still remember topic #23038 which I used a bit, it basically recommended you users with similar favorites to yours. It's not working anymore unfortunately.

The other person claimed it would take months to fetch all the favorites, did you maybe miscalculate something (ignore if you already started and are almost finished)? It looks like Kira offered the dataset to the other person, maybe you could also just ask.

Yeah, I've already started fetching favorites and have been doing so for ~10 hours. My calculations are based on the current average speed, which shouldn't change, as I'm only limited by e621's API rate limit (one request every 500 ms).
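For anyone who wants to sanity-check an estimate like this, the arithmetic can be sketched as below. The 500 ms interval is the rate limit mentioned above; the request counts are placeholders, since how many requests each user or favorites list actually needs is an assumption here, and the function name `estimated_fetch_days` is made up for the example.

```python
SECONDS_PER_DAY = 86_400


def estimated_fetch_days(request_count, seconds_per_request=0.5):
    """Days needed to send `request_count` requests at a fixed interval."""
    return request_count * seconds_per_request / SECONDS_PER_DAY


# Hypothetical case: exactly one request per user for ~900,000 users.
one_per_user = estimated_fetch_days(900_000)
print(f"{one_per_user:.1f} days")  # 450,000 seconds, i.e. about 5.2 days

# Hypothetical case: two requests per user doubles the total.
two_per_user = estimated_fetch_days(900_000 * 2)
print(f"{two_per_user:.1f} days")
```

The total scales linearly with the number of requests, so anything that reduces requests per user shortens the fetch by the same factor.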

Hey, thanks for the tip about https://e621.net/db_export/. That makes my life much, much easier.

I'll reach out to KiraNoot and ask. It would be nice to have a favorites dataset to start with, then I can use my current method to continuously update it.
