Topic: I made a thing to suggest posts you may like

Posted under e621 Tools and Applications

This topic has been locked.

Still just an early concept right now, but I hope to develop it further.

http://zoranu.ddns.net

Unlike some of the other tools I've seen here, it doesn't ask you to rate posts before it can suggest anything. Instead you enter your username and it suggests posts based on your favorites. It's also web based, so there's nothing to download and it's platform-independent.

I'd like to hear your feedback on how well it works and any improvements I could make.

Updated by xxx yaboi xxx

Let's see how this tool holds up in the face of my massive 97,402 favorites list.

Edit: it took about a minute for method 1 and suggested me 61 images (actually 60), which I will analyze below.

Method 1

0 - The very first thumbnail has '0' above it and a broken thumbnail. It links to a non-existent post
post #1154794 - I don't typically fave dickgirl images so the first one is already wrong
post #1153715 - sure
post #1153011 - already favorited (zp92)
post #1153010 - already favorited
post #1151433 - a little weird but hey whatever. approved (literally)
post #1149551 - already favorited. standard ream-a-renamon
post #1147108 - dickgirl
post #1146183 - dickgirl
post #1145813 - already favorited (tsampikos gets all the favs)
post #1145677 - dickgirl
post #1144221 - lion king images weird me out but I'll add a favorite
post #1144219 - as above
post #1142166 - nice traditionally drawn
post #1132461 - good
post #1130529 - dickgirl but I'll fav it anyway
post #1126689 - more girls with dicks
post #1125653 - I'll fav the definitely not an amputee
post #1123301 - already favorited
post #1122740 - D I C
post #1120611 - already favorited
post #1117858 - K G I
post #1116819 - few herms as well in my favs. middle of a comic isn't ideal either but can be excused
post #1115512 - sure thing
post #1112879 - already favorited
post #1112683 - R R L (dickgirl/dickgirl this time)
post #1112122 - another dickgirl/dickgirl. I favorite many lygerside images and probably lycanroc
post #1111812 - already favorited
post #1111139 - sure
post #1111134 - sure (same comic as above)
post #1104571 - sure
post #1104567 - sure
post #1104562 - sure
post #1102205 - already favorited
post #1100778 - sure
post #1099417 - already favorited
post #1097727 - sure
post #1095597 - dickgirl
post #1086199 - dickgirl and vore
post #1085484 - dickgirl
post #1085215 - dickgirl (looks good though)
post #1085214 - as above
post #1084023 - sure
post #1083236 - already favorited
post #1081762 - I don't fave many videos. And it's a dickgirl
post #1080103 - sure
post #1080101 - sure
post #1079863 - 2x dickgirl
post #1077994 - dickgirl
post #1077984 - as above
post #1156969 - sure. and overlooked. Sudden jump in post id from 1077k to 1156k
post #1156962 - gaaayyyyy
post #1156949 - sure
post #1156881 - already favorited
post #1156880 - good/great. as an mlp fan character, it is less well represented in my favorites list
post #1156879 - already favorited
post #1156786 - good/great
post #1156738 - sure
post #1156710 - already favorited
post #1156700 - weird, but I'll fav and approve
post #1156689 - sure

Analysis #1: I didn't ask for all these dickgirls (not a blacklist issue, I just have animated type:gif and bvats on there). post #1156880 and post #1156786 were the standout winners, and they were both in the second group after the post ID jumped back up. Images the user has already favorited should be removed from the array. That should take a fraction of the roughly 60 seconds it took to make the list.

Now my favorites list is at 97,430. Query took around 203 seconds and returned 74 images.

Method 2

(no empty thumbnail this time)
post #1157539 - sure thing
post #1157518 - sure (male/ambiguous)
post #1157473 - ok
post #1157336 - basically gaaay
post #1157295 - already favorited
post #1157283 - already favorited
post #1157216 - already favorited. added upvote since this is a higher resolution version of an old image
post #1157203 - already favorited
post #1157196 - sure. same artist as above but missed
post #1157183 - sure (flash I tend to ignore)
post #1157162 - sure
post #1157141 - ok
post #1157120 - gaayyyyyy space odyssey
post #1157113 - over 50 MB so I ignored it
post #1157017 - already favd
post #1156662 - already favd
post #1156646 - teh gay
post #1156621 - fav'd parent but forgot this one
post #1156503 - ghey
post #1156469 - already favd
post #1156452 - good (others disagree)
post #1156402 - ok (middle of a comic)
post #1156396 - ok
post #1156373 - already favd
post #1156366 - dickgrrl
post #1156333 - good. I won't fav this yet because it's the... 3rd?... redraw of the comic
post #1156243 - gay
post #1156215 - good
post #1156214 - as above
post #1156147 - gey sanic lol
post #1156125 - already fav'd.
post #1156102 - gayliens
post #1156092 - gay (added eyes_closed tag)
post #1155970 - alrady favvvvdddd
post #1155934 - good
post #1155849 - already favd
post #1155838 - gay. pretty cute though
post #1155828 - already fav'd despite it being palcomix
post #1155827 - dickgirls
post #1155815 - dickgirls
post #1155809 - gay
post #1155799 - sure. skipped marsminer lately
post #1155768 - what in the hell is that? lolfav'd.
post #1155654 - dickgirl. possible mistag.
post #1155636 - sure
post #1155553 - sure
post #1155551 - gaaaaay.. I mean male/male
post #1155539 - sure. should be pooled with post #1155553.
post #1155503 - already favd
post #1155498 - already favd
post #1155492 - d*ckg*rl but I'll fav it
post #1155483 - already favd
post #1155325 - herm
post #1155303 - m/m
post #1155266 - male on male
post #1155264 - sure
post #1155242 - big gay
post #1155240 - as above and the tattoo is in a weird place
post #1155138 - gay. but at least it's decidueye
post #1155127 - good
post #1155073 - gay
post #1155034 - good
post #1155021 - good
post #1154952 - sure
post #1154948 - sure
post #1154945 - sure
post #1154901 - sure
post #1154885 - sure
post #1154872 - sure (this streak brought to you by yogoat)
post #1154870 - dickgirl
post #1154794 - dickgirl, 1st image from Method #1
post #1154245 - ok
post #1153829 - already fav'd, but added upvote
post #1153806 - already faved

Analysis #2: Many more gays this time. Found a neat artist, yogoat. Over 3 times the loading time, although there could be other explanations like higher traffic since you posted the forum thread.

Favorites list at 97,459. Query took only around 39 seconds and returned 75 images.

Method 3

post #1157269 - already
post #1157216 - already (came up in method 2)
post #1157118 - already
post #1157008 - already
post #1157004 - already
post #1156570 - already
post #1156562 - ok
post #1156333 - came up in method 2
post #1156107 - sure
post #1154244 - ok
post #1153560 - already
post #1153383 - sure (comic)
post #1152886 - sure
post #1152331 - already
post #1152142 - ok (hyper)
post #1151763 - already lol'd
post #1151526 - sure
post #1151517 - already
post #1150270 - sure
post #1150139 - sure (I favorite lonbluewolf hard)
post #1150083 - sure
post #1149903 - already
post #1149831 - already
post #1149829 - already
post #1149487 - already
post #1149323 - great
post #1149295 - good (video)
post #1149128 - good (long comic)
post #1149090 - already
post #1148934 - sure (no artist tagged)
post #1148817 - ok
post #1148463 - good. forgot to favorite
post #1148418 - great. In fact, I'll ultra-fav it. That bodes well for Method #3.
post #1147712 - sure
post #1147596 - sure (comic)
post #1147433 - good
post #1147334 - good (that one comic)
post #1147210 - ok (comic)
post #1146445 - sure
post #1146390 - already had it. great pic too.
post #1146218 - already
post #1146182 - ok (comic)
post #1146063 - good (comic)
post #1146061 - as above
post #1145572 - sure
post #1145570 - as above
post #1145559 - good (comic)
post #1145180 - sure
post #1144620 - sure (very long comic)
post #1144456 - sure (end of comic)
post #1144009 - already
post #1143742 - ok
post #1143460 - Good. This is an interactive flash game with many tags on it.
post #1143389 - good
post #1142648 - already
post #1141931 - sure
post #1141774 - already/great
post #1141437 - sure
post #1141418 - sure
post #1141094 - already. added upvote
post #1140562 - sure
post #1140561 - sure
post #1140558 - sure
post #1140557 - sure
post #1140556 - already/good
post #1140555 - sure
post #1140552 - sure
post #1140550 - sure. 7 of the last 8 were eevee swaps.
post #1139937 - herm
post #1139530 - sure
post #1139506 - good
post #1139445 - sure
post #1139385 - good
post #1139021 - sure
post #1138890 - good due to the fire

Analysis #3: This method did much better than the first 2 and did not spit out many intersex or m/m. I also wonder if it gave higher weight to images with higher scores.

Final thoughts

Previously favorited images should be ignored. It makes sense and should not be hard to compute.
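A minimal sketch of that filter, assuming the tool holds the suggestions as a list of post IDs and the favorites as a collection of post IDs (the function name is made up for illustration; I don't know how the tool actually stores either). The IDs are from my Method 1 run above:

```python
def drop_already_favorited(suggested_ids, favorite_ids):
    """Return the suggested post IDs that are not favorites, keeping their order."""
    favorites = set(favorite_ids)  # set lookups are O(1) on average
    return [post_id for post_id in suggested_ids if post_id not in favorites]


# First few suggestions from Method 1; two of them were already favorited.
suggested = [1154794, 1153715, 1153011, 1153010, 1151433]
favorites = {1153011, 1153010}
remaining = drop_already_favorited(suggested, favorites)
print(remaining)
```

Building the set once means the cost grows with the size of the two lists, not their product, so even a 97k favorites list should not add noticeable time.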

Although the blacklist can't be addressed yet, maybe never, less of the types of images users blacklist should show up just because they don't tend to favorite them. If they fave a very small number of m/m or m/f images, they shouldn't show up as much. This appeared to work as expected with the experimental mode. Rating should also play a role. I'm not sure if it does that now. If someone who usually favorites rating:safe uses the tool, it should return mostly rating:s images.

Middle of a pool (comic or sequence) is an interesting case. I don't want to suggest ignoring images from the middle of a pool, and not all pools have images that directly relate to each other (there are pokedex pools). inpool:true comic might help here. An advanced option could replace any suggested image from the middle with the first image of the pool, but only if it has the comic tag. The eevee color swaps are an interesting case. They are in a pool, and all but one of them appeared in the 75 suggestions (method 3).
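Here is a rough sketch of that advanced option. Everything in it is hypothetical: the `Post` record, the `pool_first_id` field and the post IDs are invented for illustration, and the tool would have to look up the first post of a pool itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Post:
    post_id: int
    tags: frozenset
    # Hypothetical field: ID of the first post in this post's pool, if any.
    pool_first_id: Optional[int] = None


def collapse_comic_pages(posts):
    """Swap comic pages for the first page of their pool and drop duplicates."""
    seen = set()
    result = []
    for post in posts:
        target = post.post_id
        # Only redirect when the post is tagged comic AND belongs to a pool.
        if "comic" in post.tags and post.pool_first_id is not None:
            target = post.pool_first_id
        if target not in seen:
            seen.add(target)
            result.append(target)
    return result


suggestions = [
    Post(101, frozenset({"comic"}), pool_first_id=100),    # middle of a comic
    Post(102, frozenset({"comic"}), pool_first_id=100),    # same comic, next page
    Post(200, frozenset({"pokedex"}), pool_first_id=190),  # pooled, but not a comic
    Post(300, frozenset()),                                # not in a pool
]
collapsed = collapse_comic_pages(suggestions)
print(collapsed)
```

Pools without the comic tag (like the pokedex ones) pass through untouched, which is the behavior I was after.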

Will users reuse this tool often? It seems to return images from the last 2 weeks and many from the past day, so will it be fresh results after 2 weeks? Is it expensive to search for older posts?

I think the ultimate tool could be an artist suggestion tool, possibly using a web of similarity based on tags shared by the artists' images or many users favoriting images from the same 2 artists or multiple artists in clusters of artists who are similar. There are many approaches.

After "I've found X images I think you may like." you should add "It took Y.YY seconds." That way users don't need to clock it to give you feedback on that.

Updated by anonymous

Genjar

Former Staff

Pretty good. Didn't find many new favorites for me, but that's probably because I've already seen just about every post that's here.

Method one worked the best for me, while the third was full of misses.

Here's my least accurate matches from method 1:
post #375810 post #1082342 post #1081762 and especially post #1077192. No idea why it'd recommend that last one to me.

Updated by anonymous

Method 1 gave me pretty much nothing but female human on male feral animal stuff. Which, except in rare instances, I often very strongly dislike. This was not one of those rare instances - the pictures it suggested to me were by and large the sorts of things that genuinely make me unhappy to look at.

Method 2 was somewhat better, though it also returned quite a bit of material that I have blacklisted. Lots of cub art, for some reason, and then just a bunch of kind of generic furry porn which wasn't as viscerally off-putting as the stuff the first method returned, but also wasn't very interesting to me.

Method 3 was much better. Actually gave me stuff I like, and a fair amount of it - though most of it was stuff I'd seen before. Also it leaned very heavily towards solo pictures, whereas the other two leaned very heavily towards duo or group images.

Updated by anonymous

Eugh, Method 1 gave me practically nothing except for stuff that I can't stand to look at. Cub, group sex, so on.

Clawdragons said:
Method 1 gave me pretty much nothing but female human on male feral animal stuff. Which, except in rare instances, I often very strongly dislike. This was not one of those rare instances - the pictures it suggested to me were by and large the sorts of things that genuinely make me unhappy to look at.

Basically this.

Updated by anonymous

I don't have many in my favorites, but let's see.

1.
There were a few interesting pics.

2.
Much better.

3.
Also good.

Updated by anonymous

Out of 75 pictures, only two were good, but one of them was already in my favorites.

Updated by anonymous

I skipped option 1 and 2 because people were saying 3 gives the best posts.

I was not disappointed.

Updated by anonymous

Accuracy:
  • Method 1: ~ 87%
  • Method 2: ~ 59%
  • Method 3: ~ 75%

Note: accurate = acceptable (most), good (~ 40) or excellent (5, 3 favorited previously).

Obs: a thing that bothered me was the lack of posts in the following categories:

pokémon (~ 14% of my favorites).

rating:safe (~ 18% of my favorites).

-rating:safe -masturbation -sex (~ 35% of my favorites).

Updated by anonymous

Genjar

Former Staff

Now that I had time to look closer at the results, here's my stats:

Method Accuracy
1 ~85%
2 ~15%
3 ~10%

There's barely any interspecies or size_difference among 2 or 3, even though those are common among my favorites. Way too many mammals, nowhere near enough scalies or avians.

Updated by anonymous

Method Accuracy
1 ~10%
2 ~75%
3 ~70%

I had more like.

Updated by anonymous

Using my data from earlier:

Method | Images | Accuracy #1  | Accuracy #2  | Accuracy #3  | New Ultra Favorites Added
1      | 60     | ~67% (40/60) | ~43% (26/60) | ~57% (26/46) | 0
2      | 74     | ~66% (49/74) | ~42% (31/74) | ~55% (31/56) | 0
3      | 75     | ~99% (74/75) | ~68% (51/75) | ~98% (51/52) | +1

Accuracy #1: anything with "ok" or better gets in.
Accuracy #2: counts "already favorited" as misses.
Accuracy #3: removes "already favorited" from the total.
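For anyone who wants to check the arithmetic, here is a small sketch of the three definitions. The per-method counts of new hits and already-favorited posts (26/14, 31/18, 51/23) are read off the fractions in the table, and the function name is just for illustration:

```python
def accuracies(total, new_hits, already_favorited):
    """Return (accuracy #1, #2, #3) as fractions between 0 and 1."""
    acc1 = (new_hits + already_favorited) / total   # "ok" or better, favorites included
    acc2 = new_hits / total                         # already favorited count as misses
    acc3 = new_hits / (total - already_favorited)   # already favorited removed from total
    return acc1, acc2, acc3


# method: (images returned, new "ok or better" posts, already favorited)
runs = {1: (60, 26, 14), 2: (74, 31, 18), 3: (75, 51, 23)}
percentages = {
    method: tuple(round(value * 100) for value in accuracies(*counts))
    for method, counts in runs.items()
}
for method, values in percentages.items():
    print(method, values)
```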

I guess the reason my results are so different from @Genjar is that I have 193 times more favorites and I'll favorite nearly anything with a female in it. This huge sample size and simplicity makes it very easy for Method 3 to work.

Genjar's favorites list is Earth and mine is two Saturns.

Updated by anonymous

I got nothing out of all three. Option three was closest, I guess: mostly ponies, which I've already looked at, but there seemed to be a lot of humanized ponies and giant boobs, neither of which I like.

Updated by anonymous

Accuracy 1: 0% (no results)
Accuracy 2: 0% (no results)
Accuracy 3: 0% (no results)

Not sure what is going on there.. not like I'm lacking in favs.

I tried entering '8673' (user id) rather than 'savageorange' (user name), but neither returned any results other than the so-called "post #0"

I also tested in both Firefox (51.0.1) and Midori (0.5.11), in case one of my addons could have been interfering. Same results in both.

Updated by anonymous

<------------------------>
Hit 1: 39/75
Miss 1: 36/75
<------------------------>
Hit 2: 21/75
Miss 2: 54/75
<------------------------>
Hit 3: 24/75
Miss 3: 51/75
<------------------------>

Raw Data (multiplied by 3 for 75) (x - hit, o - miss)
1: xxoooxoxoxooxxoxoooxxxoxx
2: oooxooxooxxooooooxooxooxo
3: ooxxooxooxooxoxooxoooooox
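The totals can be reproduced from the raw strings with a few lines of Python. This is only a sketch of the counting, assuming each row is a sample of 25 results scaled up by 3 to estimate the full 75:

```python
SCALE = 3  # each row samples 25 results; multiply by 3 to estimate all 75

raw = {
    1: "xxoooxoxoxooxxoxoooxxxoxx",
    2: "oooxooxooxxooooooxooxooxo",
    3: "ooxxooxooxooxoxooxoooooox",
}


def hits_and_misses(marks, scale=SCALE):
    """Count 'x' (hit) and 'o' (miss) marks and scale both counts."""
    return marks.count("x") * scale, marks.count("o") * scale


totals = {method: hits_and_misses(marks) for method, marks in raw.items()}
for method, (hits, misses) in totals.items():
    print(f"Hit {method}: {hits}/75  Miss {method}: {misses}/75")
```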

Updated by anonymous

Typed in my username, all I got was a blank gray screen. :/

Edit: Working now.

Updated by anonymous

There's a new mode. I was going to do an in-depth analysis but then I got bored.

The new mode was better at picking things that I hadn't seen before that I liked, but at the same time it gave me a hell of a lot of dickgirls, which is odd because I don't have very many of those saved, and I find them very unappealing (the ones I do have saved are incidental).

Updated by anonymous

New mode actually works for me, so that's something.

Thumbnails are broken (of course; since they reference e621 servers, only thumbnails that are already cached will show up)

Hit rate is still 0% (it suggested ~78, none of them were of particular interest. One or two got upvotes for art quality)

Updated by anonymous

I decided to do a test with post #1129063 as the only favourite.

post #1129063

Results: 69

Ones that are similar to post #1129063 are in bold.
Ones I upvoted have a +1 next to them.

post #1130123
post #1120967
post #1110365
post #1109396
post #1108243
post #1107873
post #1103736
post #1102105
post #1094838
post #1068281
post #1067641
post #1062591
post #1062514
post #1051723
post #1039992
post #1020392
post #1014764
post #1012449
post #1012448
post #1007451
post #1003412
post #1003410
post #1143136
post #926188
post #922955
post #895957
post #886609
post #885965
post #873440
post #868304
post #866991
post #863211
post #855124
post #848532
post #846989
post #831601
post #831600
post #829632
post #821224
post #818749
post #811793
post #811792
post #810901
post #805945
post #798755
post #793296
post #793294
post #791465
post #456252
post #446069 +1
post #440449
post #415574
post #410373 +1
post #407441
post #396320
post #394274
post #376103
post #329325
post #325317 +1
post #321989
post #296025
post #293261
post #287682
post #248416
post #174957
post #1158931
post #1158923
post #1158866
post #1158816

17/69. Despite the only material it had to work with being one image of a mostly_nude solo male standing while holding_weapon and looking_aside, I got many images containing multiple characters, intersex characters, characters having sex and multiple intersex characters having sex. Also, the blacklist is not being taken into account if that particular intersex image is anything to go by.

Too bad it can't check my upvotes. Then again, it wouldn't know what to look for if it could.

-------------------

Test 2: 10 similar images from the following search: ~nude ~mostly_nude solo male standing sword

Found 120 this time. Maybe it finds more if your favourites are more specialized? Again, multiple characters, intersex and sex despite none of the favourites containing those things.

Has anyone had results that didn't contain intersex characters yet?

Also, after clearing out my favourites, the count is at 4. It should be 0.

Updated by anonymous

I think a lot of the issues people are having stem from the same basic issue: It is treating all tags with equal weight. However, I think special consideration ought to be given to "orientation" tags.

That is, the gender tags, the form tags (anthro, feral, humanoid, etc.), the gender-on-gender, gender-on-form, form-on-form tags. They should carry a special weight.

I'm not sure exactly how the "experimental" modes work, but I think one way to do what I'm suggesting would be to always include at least one of those tags from some favorited image, and in particular, it should always take the highest-order tag possible from an image.

Which is to say, a solo picture containing a male anthro would have two "single order" possible tags - "male" and "anthro". A duo picture containing a male anthro with a female feral would (if properly tagged) have four single-order tags (male, female, anthro, feral), and three "double-order" tags (male/female, anthro_on_feral, male_on_feral). So the way the program would work would be to choose from "male" and "anthro" if examining the solo picture but if examining the duo picture, it would ignore all of the single-order tags and only choose from "male/female", "anthro_on_feral", "male_on_feral".

The point of all this would be that, first of all, there would be a heavy focus on trying to match the user's orientation... But beyond that, this would also avoid issues where, if someone has, say, a lot of female solo pictures and male/female pictures, the program wouldn't see "male" and "solo" and figure "oh hey I'll give them a bunch of solo males!" when really the obvious conclusion from that mix of favorites is that their interests are just the opposite of that - a focus on females.
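To make the "highest order" idea concrete, here is a rough sketch. The two tag sets are tiny hand-picked samples for illustration, not a complete list, and I have no idea whether the tool could be structured this way:

```python
# Illustrative samples only; a real list would need every gender/form tag.
SINGLE_ORDER = {"male", "female", "anthro", "feral", "humanoid"}
DOUBLE_ORDER = {"male/female", "anthro_on_feral", "male_on_feral"}


def orientation_candidates(post_tags):
    """Return the highest-order orientation tags present on a post."""
    tags = set(post_tags)
    double = tags & DOUBLE_ORDER
    if double:
        # Pairing tags exist, so the single-order tags are ignored entirely.
        return double
    return tags & SINGLE_ORDER


solo_post = {"solo", "male", "anthro", "standing"}
duo_post = {
    "duo", "male", "female", "anthro", "feral",
    "male/female", "anthro_on_feral", "male_on_feral",
}
solo_choice = orientation_candidates(solo_post)
duo_choice = orientation_candidates(duo_post)
print(solo_choice)
print(duo_choice)
```

The search would then pick one tag from the returned set and always include it, so the duo picture never contributes a bare "male" on its own.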

This entire post of mine is completely unwarranted but I am tired as heck and my window is being replaced so I can't sleep. Banging and noise, woo! Blah.

Anyway. I did want to offer some encouragement, to balance out what might be seen as negative... I do think this is a somewhat nifty project - I've considered doing similar things in the past, so I support what you're going for here.

And despite how picky I can be when it comes to enjoying art, this did actually manage to expose me to something I'd not seen before that I liked, which honestly even people who know my interests well can struggle with. So I do want to say "keep at it".

Updated by anonymous

^ What if the tags you talked about were treated as a group?

That is [male/female anthro feral duo] would be treated like a single tag, [female solo feral] would be treated like a single tag. A picture would either match all of the tags in a given group and be suggested, or not fully match any group and be rejected.

To me that seems to express what you actually want.

(although the solo/duo/group thing may be a bit fuzzy due to inadequate tagging..)
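Roughly what I mean, as a sketch. It assumes the groups have already been extracted from the favorites somehow, and the function name is made up:

```python
def matches_any_group(post_tags, groups):
    """True if the post carries every tag of at least one group."""
    tags = set(post_tags)
    return any(group <= tags for group in groups)  # <= is the subset test


groups = [
    frozenset({"male/female", "anthro", "feral", "duo"}),
    frozenset({"female", "solo", "feral"}),
]

full_match = matches_any_group({"female", "solo", "feral", "outside"}, groups)
partial_match = matches_any_group({"female", "solo", "anthro"}, groups)
print(full_match, partial_match)
```

A post that only partially overlaps a group is rejected outright, which is the all-or-nothing behavior described above.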

Updated by anonymous

With 97,806 favorites, mode 4 took 101.8454 seconds and "found 1 images I think you may like". Which were not shown.

I am also anxious to see zoranu post in this thread again.

Updated by anonymous

Clawdragons said:
I think a lot of the issues people are having stem from the same basic issue: It is treating all tags with equal weight. However, I think special consideration ought to be given to "orientation" tags.

That is, the gender tags, the form tags (anthro, feral, humanoid, etc.), the gender-on-gender, gender-on-form, form-on-form tags. They should carry a special weight.

Maybe that should also be valid for some other tag groups, like fetishes/"unusual" preferences (incest, bdsm, foot fetish, sizeplay, etc), character number (solo, group, large group, duo etc), large species groups (scalie, mammal, avian, crustacean, etc) and possibly others.

Updated by anonymous

savageorange said:
To me that seems to express what you actually want.

Not exactly.

Well, first off, if you treat them as one tag that has to be included, suddenly a lot of search slots are taken up. In your example, four tags would be used, using very little for whatever actual searches would normally be going on. I think limiting it to one or two is more reasonable.

Second I think a certain amount of flexibility is fine, as long as there's at least some precedent there.

Updated by anonymous

Just for shits and giggles, I tried the following individually: (v0.4)

1. post #1049426 - 0 results. No surprise here.
2. post #1005112 - 0 results. Kinda surprised. I figured it would find at least 1 other chair image.
3. post #974624 - 0 results. I think zero_pictured images stop it from working...
4. post #521328 - 0 results. Has a copyright this time (Star Fox) but still found nothing.
5. post #1144516 - 0 results. animate_inanimate this time, no species or characters.
6. post #1157879 - 0 results. Character and copyright tags present but no species tags (or many other tags for that matter).
7. post #1157971 - 3 results. Character and copyright tags present but no species tags. Lots of general tags. Finally found something but what exactly did it look for?
8. post #1159014 - 0 results. Species tagged, others not. I'm guessing character, species and copyright tags don't make a difference. Looks like solo was ignored as well.
9. post #286706 - 0 results. big_breasts tag present, but not much else. Low tag count seems to kill it.
10. post #315433 - 0 results. Zero pictured, ~40 general tags.
11. post #765898 - 6 results. Some basic gen tags present. All results had the same character, maybe character tags aren't ignored?
12. post #207584 - 0 results. Previous test without character tag.
13. post #1146671 - 23 results. Gentags only, over 45. Not zero_pictured. See results section.

Results:
What I learned:
  • zero_pictured kills it.
  • Displayed result count is always 1 higher than actual result count.
  • It only seems to care about certain general tags and ignore everything else, though test 11 suggests otherwise.
  • There may be a minimum tag requirement for gentags. Too few and it won't work.
  • Well-tagged images (>30 tags) are way more likely to be found.
  • There may be a priority system in place, ensuring at least one tag is in every result.

Try your favourites again and see if a tag shows up in every result. If it does, post it.

Random shit:
  • Gotta love the numbers in the results section:
    • Test 7 (lucky 7) with 3 results (good things come in threes).
    • Test 7 and 11 next to each other (7-Eleven).
    • Test 11 had 6 results. The 6th prime number is 11 (1,2,3,5,7,11,13).
    • Test 13 (Unlucky 13) which had 23 results (23 enigma).
    • 7+11-13 = 5 (Law of Fives)
    • 7, 11 and 13 are consecutive prime numbers.

Updated by anonymous

Clawdragons said:
Not exactly.

Well, first off, if you treat them as one tag that has to be included, suddenly a lot of search slots are taken up. In your example, four tags would be used, using very little for whatever actual searches would normally be going on. I think limiting it to one or two is more reasonable.

I assumed that the tool was doing its own filtering to a certain extent, and thus was not subject to that kind of limit.

For example, doing only the searches involving tags that are not 'special' in the way you describe.

Updated by anonymous

I've updated things a bit.

- Some tags are now weighted to have more or less of an effect on results
- Images you've already favorited should no longer appear

Let me know if these changes have made the results better or worse for you. They have been a little better for me, but if the changes have negatively affected everyone else's results I can adjust the weights appropriately.

savageorange said:
I assumed that the tool was doing its own filtering to a certain extent, and thus was not subject to that kind of limit.

For example, doing only the searches involving tags that are not 'special' in the way you describe.

At the moment, most of the filtering is done by the tool so I can avoid hitting the six tag search limit.

Updated by anonymous

Can you add an option to exclude images that the user has downvoted? The downvote would indicate that the user has already seen the image and very likely doesn't want it in their favourites.

Updated by anonymous

BlueDingo said:
Can you add an option to exclude images that the user has downvoted? The downvote would indicate that the user has already seen the image and very likely doesn't want it in their favourites.

Votes are not accessible unless logged in.

Updated by anonymous

Sorry for the downtime, had to fix an infinite loop issue.

An additional note: results are now returned in order of how accurate the tool thinks the match is.

I'm also still looking into why some users are getting no results.

Updated by anonymous

Redid some of the tests from forum #226121: (v0.5)

  • Test 7: 3 results then, 0 results now.
  • Test 11: 6 results then, 3 results now.
  • Test 13: 23 results then, 0 results now.

Also tried test 1 of forum #226081 again. 69 results then, 0 results now. Adding 4 similar images to it didn't help.

Tag list lacks spaces. Result counter is accurate now...sometimes. I'm guessing the ":13" in test 11's results is the similarity score. Looks like it doesn't like single image entries as much anymore.

Can you make the previous versions available again?

Updated by anonymous

Seems to work well, though it'd be nice to have some sort of animated bar or widget or something to let you know the site's doing its thing.

Shows me stuff I need to expand upon in my blacklist too, though it did find quite a few things I had missed before. This is a pretty neat gadget.

Updated by anonymous

V0.5 test:
69 items generated, of which 4 were upvoted, and 2 were faved. (faved images are not included in upvote count)
These results are more understandable (ie. most images I can see why your system would choose it, even if it is not actually that interesting to me) so you might be on the right track.

Use of blacklist would probably eliminate most of those 69, in my case, but I guess it's a bit early to get into that.

Updated by anonymous

One thing I noticed with this most recent version is that it seems to aim for posts that have a lot of tags that would be found among my favorites.

To put it another way, the search is biased in favor of big flashes or giant images which are composites of many smaller scenes, because those tend to have a ton of tags on them for each of the individual pairings.

Maybe that was just my results though. Did anyone else notice anything similar?

Updated by anonymous

BlueDingo said:
Makes sense, though. Images with lots of tags are more likely to contain the tags the favorite finder is looking for.

At the same time, it's more likely that the image contains things that aren't relevant to a person's interest. I wonder if it could be made so that it tries to find a proportional value for a post - that is to say, an image would be considered more relevant over another image if they both contained the same "desired" tags, but the former contained fewer other tags. Which is to say that the proportion of tags would be more important than the absolute number.
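As a sketch of what I mean by a proportional value (the function name and the example tags are made up, and this ignores any weighting the tool already does):

```python
def proportional_score(post_tags, desired_tags):
    """Share of a post's tags that are desired tags, from 0.0 to 1.0."""
    tags = set(post_tags)
    if not tags:
        return 0.0  # an untagged post can't match anything
    return len(tags & set(desired_tags)) / len(tags)


desired = {"female", "solo", "anthro"}

# Both posts contain all three desired tags; the second has nine unrelated extras.
focused_post = desired | {"standing"}
busy_post = desired | {f"extra_{i}" for i in range(9)}

focused = proportional_score(focused_post, desired)
busy = proportional_score(busy_post, desired)
print(focused, busy)
```

Both posts match the same three tags, but the focused one scores 3 out of 4 while the busy one scores 3 out of 12, so the big composite images stop floating to the top.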

Updated by anonymous

Clawdragons said:
At the same time, it's more likely that the image contains things that aren't relevant to a person's interest.

Not really. The unwanted elements may not be tagged, and you can't filter out something that hasn't been tagged yet.

Updated by anonymous

BlueDingo said:
Not really. The unwanted elements may not be tagged, and you can't filter out something that hasn't been tagged yet.

But that's not really... You could just as easily argue that wanted elements might not be tagged, so searching for those is pointless.

Sure, some posts will be mistagged or under-tagged, but that is a fundamental problem here and doesn't affect this idea really any more than other ways of doing things.

Updated by anonymous

Clawdragons said:
But that's not really... You could just as easily argue that wanted elements might not be tagged, so searching for those is pointless.

Sure, some posts will be mistagged or under-tagged, but that is a fundamental problem here and doesn't affect this idea really any more than other ways of doing things.

If only we had more people working on fixing that. There's currently ~67000 gentags:<10 images.

Updated by anonymous

Clawdragons said:
At the same time, it's more likely that the image contains things that aren't relevant to a person's interest. I wonder if it could be made so that it tries to find a proportional value for a post - that is to say, an image would be considered more relevant over another image if they both contained the same "desired" tags, but the former contained fewer other tags. Which is to say that the proportion of tags would be more important than the absolute number.

That seems like it would reward lack of tagging more than anything else.
I mean, for example, I don't really care about hair/fur/whatever color, so if adding $FOO_(hair|fur|scales|...) to a picture would make it a worse match, that seems pretty clearly absurd. Other people do seem to care about this particular dimension, so simply discounting it wouldn't work either.

More speculatively, I would suggest that most tags are actually not important to the searcher most of the time. We have so many tags, this seems somewhat unavoidable.
Perhaps sorting tags by # of occurrences in your favs, and assigning weight=0 to the smaller half of this list, could address this.
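A quick sketch of that cutoff. It assumes favorites arrive as lists of tags, uses the raw occurrence count as the weight for tags that survive (my assumption, any other weight would do), and breaks ties alphabetically just to keep the output stable:

```python
from collections import Counter


def top_half_weights(favorite_posts):
    """Weight tags by how many favorites carry them; zero out the rarer half."""
    counts = Counter(tag for tags in favorite_posts for tag in set(tags))
    # Most frequent first; ties broken alphabetically.
    ranked = sorted(counts, key=lambda tag: (-counts[tag], tag))
    keep = len(ranked) - len(ranked) // 2  # the kept half rounds up
    return {
        tag: (counts[tag] if index < keep else 0)
        for index, tag in enumerate(ranked)
    }


favorites = [
    ["female", "solo", "outside"],
    ["female", "solo"],
    ["female", "solo", "outside"],
    ["female", "blue_fur"],
]
weights = top_half_weights(favorites)
print(weights)
```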

Alternatively, a two-stage process might be better?
First, generate a list of tags that you might be interested in, sorted by confidence (which might just be 'how many times it occurs in your fav list').
Allow you to edit this and finally submit it.
Then, generate the actual suggestions based on the submitted list. It would assign 'weights' to tags in inverse proportion to their index in the list (weight=1 for final item in list, weight=(list length) for first item in list).
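The second stage could look something like this sketch, assuming the submitted list is already in the order the user wants (both function names are made up, and the tags are just examples):

```python
def rank_weights(ordered_tags):
    """First tag gets weight len(list), last tag gets weight 1."""
    count = len(ordered_tags)
    return {tag: count - index for index, tag in enumerate(ordered_tags)}


def score(post_tags, weights):
    """Sum the weights of the listed tags that appear on a post."""
    return sum(weights.get(tag, 0) for tag in set(post_tags))


submitted = ["female", "solo", "anthro"]  # the user-edited list, best first
weights = rank_weights(submitted)
example_score = score({"female", "anthro", "outside"}, weights)
print(weights, example_score)
```

Tags the user deleted from the list simply score 0, so editing the list doubles as a way to say "I don't care about this".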

Updated by anonymous

I've added a local blacklist function, so you can now try that and see if it helps with the bad suggestions.

Updated by anonymous

I'm still looking at the 99 results returned to me by v0.5.0, but you can tell that the highest scored images are flash animations, model_sheet, and multiple_images posts that tend to have lots of tags. At least, that's the case with my 98k favorites list. A larger favorites list could mean more tags earning points, and more points going to well or overly tagged posts.

Updated by anonymous

Lance_Armstrong said:
I'm still looking at the 99 results returned to me by v0.5.0, but you can tell that the highest scored images are flash animations, model_sheet, and multiple_images posts that tend to have lots of tags. At least, that's the case with my 98k favorites list. A larger favorites list could mean more tags earning points, and more points going to well or overly tagged posts.

Why did you make search links instead of wiki links?

At least there's a blacklist feature now. You can add model_sheet, multiple_images, absolutely_everyone, etc. to it to reduce the chances of posts with massive tag counts from showing up.

A favourites list of one doesn't seem to work at all anymore. I've messaged zoranu about it.

Updated by anonymous

BlueDingo said:
At least there's a blacklist feature now. You can add model_sheet, multiple_images, absolutely_everyone, etc. to it to reduce the chances of posts with massive tag counts from showing up.

Just for shits, I tried tagcount:>10 on the blacklist. Then I put Genjar in. As expected, it didn't work. I doubt we support tagcount on the real blacklist either.

I added the ones you suggested along with animated. It returned some good results, like post #624142. It also returned 148 results, so no more detailed breakdowns from me. I found another tag to block: multiple_scenes.

The paragraphs with "Total: score" don't need a height of 155px. Unless tags get added back in.

Updated by anonymous

Don't want to be mean but this thing doesn't seem to work at all.

Maybe it's sorta my fault, I don't have a lot of shit in my favorites and if I do it's mostly just high-quality art (See: shamanguli) and mostly pony stuff the further down you get. (All the femboy stuff starting semi-recently)

But the site you provided gives me stuff like this:
post #742129 (A pokemon animation, I don't have a single pokemon pic in my fav)
2 more pokemon animations after that wtf?

post #1085938 This is a hyper fetish. (hyper body part = giant boy parts) This is not only a huge turn off for me, but never have I fav'd an image with this shit included.

post #1113222 Turn off, nothing close to what I've fav'd

post #1079752 Sonic? really? Nothing close.

post #1123557 More hyper body parts.

Where are all the femboys? Where are all the ponies I'd expect to see based on favs list? And the big scale images that are mostly just a show of art skill (Again see the artist I mentioned)

I keep getting recommended pics like this:
post #1123503

Nothing at all like what I like or have fav'd

Maybe I just don't have enough pics for the site to recommend but I haven't seen a single pic on this site that I actually enjoy looking at.

Updated by anonymous

Every version so far has returned some unwanted images. Using the recently added blacklisting feature should improve the results a bit.

Updated by anonymous

After applying the blacklist I get some good images. Then I got a large amount of Zootopia posts. Specifically Judy. I'm not that into the series but it seems the few faves have triggered them. Then I got a large amount of anthro pinup pics. Those are not that interesting to me most of the time. I can see why I got them because there are so many solo anthro pics out there so you're bound to get some faves of them.

Got a new fave from the first results :)
post #1082185

Updated by anonymous

I would suggest explicitly excluding child posts to avoid cluttering the results with close duplicates. That will also allow for better spread on the results. I do notice a strong bias towards posts with lots of tags for matching. I'm guessing this is a result of you aggregating the tags from favorited posts and then counting the number of matches against posts. You might want to try to normalize the results based on the total tag count of the post, since my interest in a post depends not on how many distinct subjects it touches, but on how much it focuses on my core interests.
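
To illustrate what I mean by normalizing, here is a small sketch. It assumes the tool keeps a per-tag weight table (a guess, as above); the function names, weights, and tags are all invented for the example.

```python
def raw_score(post_tags, weights):
    """Sum of the weights of every tag on the post (unknown tags count 0)."""
    return sum(weights.get(tag, 0) for tag in post_tags)

def normalized_score(post_tags, weights):
    """Raw score divided by the post's total tag count."""
    if not post_tags:
        return 0.0
    return raw_score(post_tags, weights) / len(post_tags)

# Hypothetical weights derived from someone's favorites.
weights = {"canine": 3, "solo": 2, "outside": 1}

focused = ["canine", "solo"]
busy = ["canine", "solo", "outside",
        "tag_a", "tag_b", "tag_c", "tag_d", "tag_e", "tag_f", "tag_g"]

raw = (raw_score(focused, weights), raw_score(busy, weights))
norm = (normalized_score(focused, weights), normalized_score(busy, weights))
```

The heavily tagged post wins on raw score, while the focused post wins once the score is divided by tag count. As pointed out earlier in the thread, plain division also rewards under-tagged posts, so it would probably need some tuning.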

Updated by anonymous

Some of the recent posts here make me think there are still problems with the weighting.

The following may or may not help (since I don't know the details of the algo), but hopefully it provides some inspiration at least:

  • extending the 'sex/pairing/bodytype are specially weighted' idea:
  • Instead of giving each tag a weight, give each tag pair a weight. These weights could be generated somewhat like this (Python):
# Counter is a simple class for counting things. If you didn't have it, you could emulate it with a guard,
# eg. 'if ordered in weights: weights[ordered] += 1 else: weights[ordered] = 1'
from collections import Counter
weights = Counter()
for post in favorites:
    # pairs already counted for this post, used to avoid double-counting
    counted = set()
    for tag1 in post.tags:
        for tag2 in post.tags:
            if tag1 == tag2:
                continue
            # (a, b) and (b, a) are the same pair, so sort before comparing
            ordered = tuple(sorted([tag1, tag2]))
            if ordered in counted:
                continue
            counted.add(ordered)
            weights[ordered] += 1

# do any multiplication here, eg. increasing the weight of pairs that include sex, pairing, or body type.

# perhaps also normalize the weights as it may be possible to get absurd scores otherwise.

[...]

# to score a post:
counted = set()
score = 0
for tag1 in post.tags:
    for tag2 in post.tags:
        ordered = tuple(sorted([tag1, tag2]))
        if tag1 == tag2 or ordered in counted:
            continue
        # remember the pair so its reversed duplicate is skipped
        counted.add(ordered)
        score += weights[ordered]

I've actually considered this problem in the past.. this is just a basic approach to the concept of correlating tags. My ideal would probably examine the user's favs and calculate entire 'tag clusters' that represent strong themes in their favs; I'm not yet sure how to do that properly though.

A somewhat simpler yet harder to write out approach: weight[(tag1, tag2)] == global correlation of tag1+tag2. ie. if tag1 and tag2 were both on all faved posts, weight would be 1.0. if tag1 and tag2 were both on half the faved posts, weight would be 0.5, etc.
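
Written out, that version might look roughly like this. It's a sketch under the same assumptions as before (favorites as plain lists of tag strings, `pair_weights` and `pair_score` being names I made up), and "correlation" here just means the fraction of faved posts carrying both tags.

```python
from collections import Counter
from itertools import combinations

def pair_weights(favorites):
    """Weight of a tag pair = fraction of favorited posts containing both tags."""
    total = len(favorites)
    counts = Counter()
    for tags in favorites:
        # sorted(set(...)) gives each unordered pair exactly once per post
        counts.update(combinations(sorted(set(tags)), 2))
    return {pair: n / total for pair, n in counts.items()}

def pair_score(post_tags, weights):
    """Sum the weights of every tag pair present on a candidate post."""
    pairs = combinations(sorted(set(post_tags)), 2)
    return sum(weights.get(pair, 0.0) for pair in pairs)

favorites = [
    ["canine", "solo", "male"],
    ["canine", "solo"],
]
weights = pair_weights(favorites)
example_score = pair_score(["solo", "canine", "outside"], weights)
```

Here ("canine", "solo") is on every favorite and gets 1.0, while the pairs involving "male" are on half of them and get 0.5. Using itertools.combinations also avoids the manual double-counting guard.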

Updated by anonymous

Sorrowless said:
After applying the blacklist I get some good images. Then I got a large amount of Zootopia posts. Specifically Judy. I'm not that into the series but it seems the few faves have triggered them. Then I got a large amount of anthro pinup pics. Those are not that interesting to me most of the time. I can see why I got them because there are so many solo anthro pics out there so you're bound to get some faves of them.

Got a new fave from the first results :)
post #1082185

Good eye. post #1082185 is already in my ultra favorites.

I don't think we can assume that about the Judy/Zootopia images until we know more about how the algorithm works. But the size of the favorites list seems to have a big impact.

Updated by anonymous

Lance_Armstrong said:
But the size of the favorites list seems to have a big impact.

Size, or content variety? If all your favourites are basically the same thing then chances are the favfinder's search results will be more focused on one content type, but if your favourites have a bit of everything then it probably won't know what exactly to look for and returns a bit of everything. At least the blacklist lets you control this somewhat.

Updated by anonymous

Sorrowless said:
After applying the blacklist I get some good images. Then I got a large amount of Zootopia posts. Specifically Judy. I'm not that into the series but it seems the few faves have triggered them. Then I got a large amount of anthro pinup pics. Those are not that interesting to me most of the time. I can see why I got them because there are so many solo anthro pics out there so you're bound to get some faves of them.

Got a new fave from the first results :)
post #1082185

I see you have found the kemono god known as Kikurage. You've made a great discovery.

Updated by anonymous

zoranu said:
I've added a local blacklist function, so you can now try that and see if it helps with the bad suggestions.

Helps out a ton Ty. <3

Updated by anonymous

I don't know what 3 methods you talk about since I only see one in the site. Sorry if I am missing something, I can be dumb with these things sometimes.

It did not find much, and it found many anthro and anal (both things I dislike, especially the latter). It found a couple of things that fit my preferences but they were poorly drawn (amateur art) and things I have already seen before.

However my preferences are very specific and very unfetishistic so it doesn't surprise me at all, I was expecting this. There's not much in this site or in r34 in general that I favorite, so I guess it can help others a lot more than me, so I respect it.

Updated by anonymous

ckgjkjj6 said:
I don't know what 3 methods you talk about since I only see one in the site. Sorry if I am missing something, I can be dumb with these things sometimes.

In the first version, there were 3 methods. Since then, it has been updated to use a single, (hopefully) more accurate method.

Updated by anonymous

BlueDingo said:
I take it this is not being worked on anymore?

It is, but I really only have time to work on it on my days off.

Updated by anonymous

well this sounds interesting.

started with the 6.0 search

332 faves here, 177 results after entering my blacklist, and... wow, post #1147705 is probably the best jake long post i've seen so far. oh, it's one of narse's posts, well no wonder it's of such good quality.

post #630014 ok... think i've seen that particular version of HTH before.

post #478078 :/ meh

post #9419 o.O why is the artist tag nobuyuki when the huge red artist sig is zomay? well, i think that's a sig. and this artists art style is all over the place. in most posts there is no less than 3-5 different art styles. i also suspect that post is missing at least one character tag...maybe. that character with his shorts down looks familiar.

post #45624 eh, never cared much for bowser porn. preferred koopaling and other koopas. should this pic have the hyper_muscles tag or something?

not bad so far. seen some i like (who doesn't like narse level of quality?) and some not so much and a good amount of...meh.

post #110065 well, that was...different and interesting.

powerpuff girls? ...skipping that one...

post #1165625 newgrounds...pussymon....nope. and these sketch posts aren't particularly to my liking but they do occasionally have good stuff in them.

what is this? while attempting to locate the artist of post #8406 i wound up here. note, the 134 x 150 e621 link actually leads to post #747939 for some reason.

post #806255 lol funny if mean

flookz, smilebomb, never heard of them but pretty good quality.

post #50402 and this is...?

whitemantis... bugs bugs and more japanese bugs... but it's still quite good quality regardless.

post #692279 quality pic, ruined by trashy quality cum. >.<

now for 5.0... pretty much the same results, just in a somewhat different order.

regarding the thumbnails in the results: if you can, you may want to replicate how e621 handles them, as some tall or wide ones show up looking squished and weird in the results.

Updated by anonymous

Just uploaded an update, mostly cleaning up the front end, but there are some back end changes as well.

Changes:
-Added status indicator to the page title. You can go do something in another tab/window and the title will change to "FavoriteAnalyzer - Done" when results are ready.
-Cleaned up result layout
-Added tag searching. Works like e621's built in search, but filters and orders based on weights from looking at your favorites.

Updated by anonymous

Glad to see it works for single-favourite accounts again. The last few versions didn't.

Time to test...

post #1187748 returned over 200 results. It seems to be handling undertagged images better than before.

Interesting. I managed to get 406 results from a barely-tagged blank page, many of which featured characters. I'm guessing the zero_pictured tag wasn't taken into account, though I don't see much reason for why it should when you consider most of our content contains characters.

Updated by anonymous

Heeey that's pretty good

I like this service, thank you!

Updated by anonymous

Some feature suggestions:

  • Converting the relevance score to a percentage and excluding results below a given percentage (eg. user enters 50%, feature excludes images that are <50% relevant).
  • An ordered list for how frequently tags show up in your favourites. It might be interesting to see which ones show up the most (besides "anthro" because you kinda expect that one).
  • The ability to enter a post ID and have the feature attempt to find potential favourites based on that post. Useful if you want posts based on one particular favourite.
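
For the first suggestion, a minimal sketch of how the percentage cutoff could work. It assumes "100% relevant" means the highest score in the current result set, which is my own guess at a definition; `filter_by_relevance` and the post IDs below are hypothetical.

```python
def filter_by_relevance(scored_posts, min_percent):
    """Convert raw scores to a percentage of the top score and drop
    anything below min_percent. Returns (post_id, percent) pairs,
    most relevant first."""
    if not scored_posts:
        return []
    top = max(scored_posts.values())
    if top <= 0:
        return []
    percents = {pid: 100 * s / top for pid, s in scored_posts.items()}
    kept = [(pid, pct) for pid, pct in percents.items() if pct >= min_percent]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Made-up post IDs mapped to raw relevance scores.
scores = {101: 40, 102: 20, 103: 10}
results = filter_by_relevance(scores, 50)
```

With a 50% cutoff the lowest scoring post is dropped and the other two are returned in order of relevance.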

Updated by anonymous

Updated again, mostly optimisations. Some requests in the old version were taking over 2 minutes to return results. New version (in my tests at least) is now down to < 30 seconds for most requests.

Updated by anonymous

Yep, it's faster.

Interesting, my test run returned a lot of images I've previously tagged, which probably wouldn't have shown up if I hadn't tagged them. Looks like my tagging efforts are actually having an effect.

Updated by anonymous

have you published the algorithms which make the program work? right now i see a black box, which may be the cause of different levels of effectiveness for different users.

Updated by anonymous

Tried it out and this is literally the first thing that came up
post #1208044
How the plumbus does the program know I'm a fighting game nerd?!
On top of that Know what fighting game I've been wanting to play
longer than Bayonetta 2 or Smash 4 0_0;
Seriously that fact about me is less

But in all seriousness lolz ◠‿◠)
(And frankly straddling coincidence aside T wT;)
It's cool but it still needs a bit of work.
Looked up my name and hardly any guys in undies popped up. That's like
my whole thing/preface when it comes to looking up stuff on 621 ◠‿╹)

Updated by anonymous

Just tried it and found a lot of great new images and a comic. Could it be that my Favorites list is small compared to most? I only favorite pictures I really really like, with occasionally one per artist for their best image. I don't favorite often either, typically one or two per month if even that. If your favorites list is too big I believe the site doesn't work as well because there's too much to work with.

Updated by anonymous

kimjoy said:
Could it be that my Favorites list is small compared to most?

Yes.

kimjoy said:
If your favorites list is too big I believe the site doesn't work as well because there's too much to work with.

Good thing you can search your favourites just like anything else, then. If you want specific ones, just add more tags to the search.

Updated by anonymous

Tried a test with over 1000 images and found two issues:

1. It really seems to like searching recent uploads, so much so that it won't show anything else unless forced to via id:<#.
2. Using the blacklist makes searches take way too long for some reason. I let it run for 15 minutes straight before giving up. The blacklist at the time had only 1 tag in it.

Neither issue occurred during the one image tests. Also, giving it only one image to work with delivers more focused results... usually.

PS. I think I broke my favourites counter. It says -13.

Updated by anonymous

BlueDingo said:
Tried a test with over 1000 images and found two issues:

1. It really seems to like searching recent uploads, so much so that it won't show anything else unless forced to via id:<#.
2. Using the blacklist makes searches take way too long for some reason. I let it run for 15 minutes straight before giving up. The blacklist at the time had only 1 tag in it.

Neither issue occurred during the one image tests. Also, giving it only one image to work with delivers more focused results... usually.

PS. I think I broke my favourites counter. It says -13.

Thirteen images got deleted before the favorite could be unregistered, and so it became a negative once it did.

Updated by anonymous
