Topic: Generic Tags

Posted under Tag/Wiki Projects and Questions

We seem to have a lot of problems with tags such as:

red_hair
green_hair
blue_hair
...

red_eyes
green_eyes
blue_eyes
...

red_stripes
green_stripes
blue_stripes
...

red_lips
green_lips
blue_lips
...

english_text
russian_text
japanese_text
...

small_breasts
big_breasts
huge_breasts
...

small_balls
big_balls
huge_balls
...

red_background
green_background
blue_background
...

anal_penetration
vaginal_penetration
nipple_penetration
...

Needless to say, the amount of such tags we have is absurd.

--
So, I propose we modify the system and introduce a feature called "Generic Tags".

The word generic means tags that are "generalized". That is, a tag that can work in multiple ways and supports parameters (similar to how generic types work in programming languages). The exact notation and search notation are of course up for discussion, but for now let's use '<' and '>'.

For red_hair, blue_hair, etc. we can create the following generic tag:
- <color>hair, which means hair is a generic tag with a parameter color
- red implies color. This means red is a type of color.

For small_breasts we can create <size>breasts, which means breasts are a generic tag which supports a parameter size where small is a type of size.

For anal_penetration we can create <penetration_type>penetration, which means penetration is a generic tag that has one parameter, namely penetration_type. anal is a type of penetration. More about this further down.

For english_text we can create <language>text where text is a generic tag with a parameter language.

--
This way, we wouldn't need a billion tags for all the types of hair, types of penetration, types of background, types of text, etc, but only a single tag for each.

There's another feature that can be introduced with this that I would call Tag Safety.
When a user enters the tag hair, the system can warn the user to include the tag's parameter, the hair's color. Of course, hair's color can only be a tag which implies color, and the system will not allow the user to enter <finger>hair. In fact, the system can offer the user a dropdown list of all the possible colors and allow the user to choose from the list instead of typing, which would make tagging not only easier, but also less prone to mistakes.
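
A minimal sketch of how such a check could work, assuming the '<value>base' notation used in this post. The tables IMPLIES and GENERIC_TAGS and the function names are made up for illustration and are not part of the site:

```python
import re

# Hypothetical implication table: each value tag implies one parameter type.
IMPLIES = {
    "red": "color", "green": "color", "blue": "color",
    "small": "size", "big": "size", "huge": "size",
}

# Hypothetical generic tags and the parameter type each one accepts.
GENERIC_TAGS = {"hair": "color", "eyes": "color", "breasts": "size"}

TAG_RE = re.compile(r"^<(?P<value>[a-z_]+)>(?P<base>[a-z_]+)$")


def allowed_values(base):
    """Values a dropdown could offer for a generic tag, sorted by name."""
    param_type = GENERIC_TAGS[base]
    return sorted(v for v, t in IMPLIES.items() if t == param_type)


def validate(tag):
    """Return (base, value) for a valid '<value>base' tag, else raise ValueError."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not in <value>base form: {tag!r}")
    base, value = match["base"], match["value"]
    if base not in GENERIC_TAGS:
        raise ValueError(f"{base!r} is not a generic tag")
    expected = GENERIC_TAGS[base]
    if IMPLIES.get(value) != expected:
        raise ValueError(f"{value!r} does not imply {expected!r}")
    return base, value
```

Here validate("<red>hair") is accepted, validate("<finger>hair") is rejected because finger does not imply color, and allowed_values("hair") is the list a dropdown would show.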

Also, there's another feature that comes to my mind which stands out (at least for me). There would be a way to add gender property to characters, so the users would be able to find a specific gender character in multi-character images. For instance, right now if you search for male rainbow_dash duo, most of the time you're going to get female rainbow dash with another male character. But if we searched for <male>rainbow_dash, then you would be guaranteed to find a male rainbow_dash character.

--
Some other examples I found with a short browse through tags:
year<year_number> - 2015 / 2014
cum_in<body_part> - cum_in_ass, cum_in_pussy..
cum_on<body_part> - cum_on_chest, cum_on_breasts..
<body_part>tuft - chest_tuft, ear_tuft..
<body_part>penetration - anal_penetration, ear_penetration..
<depth>penetration - deep_penetration, shallow_penetration..
<color>penis - blue_penis, red_penis..
<color>pawpads - blue_pawpads, red_pawpads..
<color>shirt - blue_shirt, red_shirt..
<body_part>in_mouth - tail_in_mouth, finger_in_mouth..
on<lying_position> - on_back, on_side..
<body_part>mouth or living<body_part> - tail_mouth, ear_mouth..

--
Some examples of generic tags using two or more parameters. Oh, did I forget to mention generic tags can have more than one parameter? :P
<gender>on<gender> - male/male, male/female.. - For sex.
<body_part>in<body_part> - penis_in_penis, tail_sex, vaginal_penetration.. - When something is inserted into something. A very important and versatile tag.
<sex_toy>in<body_part> - anal_insertion, vaginal_insertion.. - When a sex toy is inserted into something.
<gender>on<gender>on<gender> - For threesome sex.
<anthro_level>on<anthro_level> - human_on_feral, feral_on_feral..
<body_part>on<body_part> - hand_on_head, cum_on_breasts.. - Hmm, not sure if cum counts as a body_part. It does come from the body..
<sex_toy>on<body_part> - vibrator_on_penis..

--
Pros and cons:
The benefits (pros) of generic tags are the following:

  • Reduced amount of tags - The amount of tags can be greatly reduced if it's decided that full backwards compatibility is not necessary. For instance, all vaginal_penetration tags can be replaced with <body_part>in<vagina>. In this case, all the *_penetration tags can be removed and wiki articles merged.
  • Less wiki redundancy and better consistency - Having fewer tags would also mean fewer wiki articles, since you only need one wiki article per generic tag, and the wiki article would be consistent for all the parameter combinations.
  • Umbrella tags - Generic tags automatically act as umbrella tags for all the parameter combinations. For instance, <color>hair automatically serves as an umbrella tag for all color hair tags, so the user can search for <color>hair without specifying a color. On the other hand, <body_part>in<body_part> automatically serves as a 3-way umbrella tag. For instance, <penis>in<vagina> can be found using <penis>in<body_part> or <body_part>in<vagina> or <body_part>in<body_part>.
  • Better searchability and search accuracy - Due to the above point, searching for generic tags is much more powerful. Even if you tag something obscure, you won't need to create a new tag for it and the tag will be findable in more than one way.
  • Reduced resource usage - Because generic tags act as umbrella tags, there is much less need for SQL commands that contain the LIKE keyword. If a user wants to search for any colored hair, he won't need to search for *_hair to get the results he wants. This means the database can use the table index and quickly look up a single tag instead of having to slowly search through all the tags.
  • Drop down menus and autocompletion - Parameters of generic tags are a perfect place to use drop down menus and offer the user a choice of the parameter value. Having drop down menus has obvious benefits such as no mistakes, no tag knowledge required and no writing required.
  • Increased tagging simplicity - There are 3 points to this:
    • Users need less tagging knowledge. And the less tag knowledge the user needs, the simpler the tagging gets.
    • Less is more. You can describe an image better using fewer tags (with multi parameter tags).
    • Generic tags can follow the "tag what you see" principle much better than constructed tags (penis_in_vagina vs penis + vaginal_penetration)
  • Increased tagging amount - Needing less tag knowledge means users effectively know more of the tagging system. The more tagging knowledge users have, the more tags they will add. Also, the simpler the tagging is, the more tags will get added. On average.
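
To make the umbrella tag point concrete, here is a rough sketch of the matching rule. It assumes a tag is stored as a base plus a tuple of parameter values, and that a query parameter may be either a concrete value or the type that value implies. The IMPLIES table and the matches function are invented for this example:

```python
# Hypothetical implication table: concrete value -> parameter type.
IMPLIES = {
    "penis": "body_part",
    "vagina": "body_part",
    "tail": "body_part",
    "mouth": "body_part",
}


def matches(query, stored):
    """True if a stored tag satisfies a query.

    Both are (base, params) pairs. A query parameter matches when it equals
    the stored value or names the type that the stored value implies.
    """
    q_base, q_params = query
    s_base, s_params = stored
    if q_base != s_base or len(q_params) != len(s_params):
        return False
    return all(q == s or IMPLIES.get(s) == q for q, s in zip(q_params, s_params))


# <penis>in<vagina> as stored on a post
stored = ("in", ("penis", "vagina"))
```

With this rule, the stored tag is found by ("in", ("penis", "body_part")), ("in", ("body_part", "vagina")) and ("in", ("body_part", "body_part")), but not by ("in", ("tail", "body_part")).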

Drawbacks (cons):

  • More than 5 minutes of work is required. Drat.
  • Brain is required. Double drat.
  • Someone needs to do it. Triple drat.

In other words, besides the coding and database work, generic tags would need to be created, and often their parameters also. Also conversion of existing tags into generic tags would be required through careful scripting. Other than the work required to make changes, I don't see any drawbacks of adding the power of generic tags to the current system. For users things would be simpler, for admins the things would be a bit more complex. Or would they be..

Edit: Updated with discussion findings, added pros and cons.

Updated

It's definitely an interesting idea. The only problem I can see with this, if it were ever going to be implemented, is that it would probably involve a drastic overhaul of the current tag system (or not, I'm not a coder), and seeing how people are used to the current system, I doubt many would be willing to change along with it.

But once again, it's definitely an interesting idea, and I wouldn't mind using it.

Updated by anonymous

The change can be 100% transparent and backward compatible. If, for instance, we set the separator between the generic tag and parameter to be an underscore (_), then red_hair would function pretty much the same, while in fact being two separate tags. However, for clarity's sake I've been using different separators.

Updated by anonymous

This is an extremely interesting idea

Don't know if it's similar to that tag group thing they had planned a couple of years ago

From small_breasts we get <size>breasts, which means breasts are a generic tag which supports a parameter size where small is a type of size.

Yes please
We currently have no tag for 'medium breasts' just large+ and small-, so this would work great

Updated by anonymous

titanmelon said:
We currently have no tag for 'medium breasts' just large+ and small-, so this would work great

I'd rather use "average" than "medium", but I can see the usefulness.

Updated by anonymous

As someone who codes (and understands databases):

  • Regarding completion, it might be more doable when we have an enumeration system like you are describing. Otherwise any completion implementation would have to ask the server to search through hundreds of thousands of tags each time the user types -- which is probably a prime reason tag completion hasn't already been implemented.
  • I would be against this part:

If we turned every character tag into a generic tag with a parameter gender (eg. <gender>rainbow_dash), then users would finally be able to find a specific gender character on multi-character images! eg. <male>rainbow_dash in a male/female image where most of the time it's usually a female rainbow_dash.

as a) parametrization proliferation is not good (consider that some people will want <feral>rainbow_dash or <anthro>rainbow_dash, or <macro>rainbow_dash, etc), and b) Char's tag grouping proposal addresses this more directly and cleanly (one group per depicted character, allowing you to search for any combination of characteristics on a single character, not only the 'popular' ones).

  • Multiple parameters are a sticking point IMO. There are ways to do this, but AFAIK they are either inefficient or awkward.

Updated by anonymous

Circeus said:
I'd rather use "average" than "medium", but I can see the usefulness.

Average implies a standard

In any case;
small, average, large vs
small, medium, large etc fits the nomenclature

Updated by anonymous

Oh, finally someone who understands a bit more about what I wrote.

Regarding completion, it might be more doable when we have an enumeration system like you are describing. Otherwise any completion implementation would have to ask the server to search through hundreds of thousands of tags each time the user types -- which is probably a prime reason tag completion hasn't already been implemented.

If the completion only works on the parameters of generic tags, then it's simple. No enumerations required.
For the tags, you have a column "implicates_tag", right? You index this column, so you get fast searching. Also, the completion wouldn't be offered every time you type, but only when the tag contains the symbol used for generic tags. Furthermore, you can add client cache that contains completion entries and only refresh it if an actual change has been detected. Lastly, you only trigger completion a few seconds after the user stops writing. So this trivial problem is solved.

In other words, a few seconds after the user stops writing you check if the tag contains the symbol used for generic tags, and if the cache, which contains tags and completions, doesn't contain this tag yet. Only then you trigger completion server query. And then this server query is completed instantly because everything you look for is indexed.
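
A sketch of that flow, under assumptions: the fetch callable stands in for the server query, the class and method names are invented, and refreshing the cache when tags change is left out:

```python
import time


class ParameterCompleter:
    """Caches completion lists per generic tag and debounces lookups."""

    def __init__(self, fetch, symbol="<", delay=2.0, clock=time.monotonic):
        self.fetch = fetch      # hypothetical server call: base tag -> list of values
        self.symbol = symbol    # marker that identifies a generic tag
        self.delay = delay      # seconds of idle time before completing
        self.clock = clock      # injectable so the timing can be tested
        self.cache = {}
        self.last_keystroke = None

    def on_keystroke(self):
        """Record the time of the latest keystroke."""
        self.last_keystroke = self.clock()

    def complete(self, text, base):
        """Return completions for `base`, or None if completion should not run."""
        if self.symbol not in text:
            return None  # plain tag: never offer completion
        if self.last_keystroke is None:
            return None  # nothing typed yet
        if self.clock() - self.last_keystroke < self.delay:
            return None  # user is still typing
        if base not in self.cache:
            self.cache[base] = self.fetch(base)  # only a cache miss hits the server
        return self.cache[base]
```

The server is only asked once per generic tag; repeated completions for the same tag are answered from the client cache.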

I would be against this part: (turn every character tag into generic tag)
as a) parametrization proliferation is not good (consider that some people will want <feral>rainbow_dash or <anthro>rainbow_dash, or <macro>rainbow_dash, etc), and b) Char's tag grouping proposal addresses this more directly and cleanly (one group per depicted character, allowing you to search for any combination of characteristics on a single character, not only the 'popular' ones).

You don't understand. This is the exact feature you need to show tag groups. Not just character, but any groups. It's the exact framework you need and if you have it, then character grouping becomes a trivial problem:

You have 3 tags:

  • <feral>rainbow_dash
  • <macro>rainbow_dash
  • <female>rainbow_dash

How you show them is up to you. Showing them in groups is no problem:

  • rainbow_dash
    • feral
    • macro
    • female

And how you parse them is up to you. You could just as easily make it so that if you parse the text "rainbow_dash(feral, macro, female)" you would generate those 3 tags.
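
For illustration, a small sketch of both directions, parsing the grouped text into tags and grouping tags for display. The function names are made up, and the "base(p1, p2)" syntax is just the example from the sentence above:

```python
import re
from collections import defaultdict

GROUP_RE = re.compile(r"^(?P<base>[a-z_]+)\((?P<params>[^)]*)\)$")


def expand(text):
    """Turn 'base(p1, p2)' into ['<p1>base', '<p2>base']."""
    match = GROUP_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in base(p1, p2, ...) form: {text!r}")
    base = match["base"]
    params = [p.strip() for p in match["params"].split(",") if p.strip()]
    return [f"<{p}>{base}" for p in params]


def group(tags):
    """Group '<param>base' tags by base, keeping the order they were given in."""
    grouped = defaultdict(list)
    for tag in tags:
        param, base = tag[1:].split(">", 1)
        grouped[base].append(param)
    return dict(grouped)
```

expand("rainbow_dash(feral, macro, female)") gives the 3 tags listed above, and group() on those 3 tags gives back one rainbow_dash entry holding feral, macro and female.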

Parametrization proliferation (funny choice of words) is good. Tags become more powerful. Search becomes more powerful. Why would it be bad?

Multiple parameters are a sticking point IMO. There are ways to do this, but AFAIK they are either inefficient or awkward.

I'm not yet sure about the details of multiple parameters, but once we get down to it, I'm sure things will turn out to be quite trivial ^^

Updated by anonymous

this has been "planned" since site launch essentially

that means it'll never happen, don't hold your breath

Updated by anonymous

Delian said:
Oh, finally someone who understands a bit more about what I wrote.

If the completion only works on the parameters of generic tags, then it's simple. No enumerations required.

Lol?
You are saying in one sentence 'parameters of generic tags', and in the next 'enumerations', like they're different.

When you say that <color> means {red, green, blue, aqua, brown, etc} you are defining an enumeration, as far as I can see. That's what an enumeration is. Anytime you say 'valid values for X are {Y,Z,W} and no others' you are specifying an enumeration.

For the tags, you have a column "implicates_tag", right? You index this column, so you get fast searching.

.. are you or are you not proposing to swap out the current tag system for a parametrized tag system?
As I understood it, when migrated to your system, a tagging could no longer be represented by a single integer, but would require multiple integers -- a template id, and a 'parameter value' id.

Also, the completion wouldn't be offered every time you type, but only when the tag contains the symbol used for generic tags. Furthermore, you can add client cache that contains completion entries and only refresh it if an actual change has been detected. Lastly, you only trigger completion a few seconds after the user stops writing. So this trivial problem is solved.

The latter solutions are obvious (at least to me). But you don't seem to realize that I am saying that your proposed system would make completion handling -easier-.

You don't understand. This is the exact feature you need to show tag groups. Not just character, but any groups. It's the exact framework you need and if you have it, then character grouping becomes a trivial problem:

You have 3 tags:

  • <feral>rainbow_dash
  • <macro>rainbow_dash
  • <female>rainbow_dash

How you show them is up to you. Showing them in groups is no problem:

  • rainbow_dash
    • feral
    • macro
    • female

And how you parse them is up to you. You could just as easily make it so that if you parse the text "rainbow_dash(feral, macro, female)" you would generate those 3 tags.

I do understand. I'm no longer sure what the hell you are proposing (I don't believe you talked of 'tag generation' before, I thought you were talking about changing how tags were stored.), but I do understand that it's trivial to transform [(a,b),(c,b)] -> {b: {c, a}} or whatever else you want..

Parametrization proliferation (funny choice of words) is good. Why would it be bad? Tags become more powerful. Search becomes more powerful.

Power is only good if you can use it. Usability is hard and you need to try to make a solution that everyone can approach readily.

Something like <color>_hair is easy to understand. Then you might have <hairstyle>_hair, maybe, and hair_<hair_decoration>. You basically hit everything with those three, right? Fine, good. Easy to understand.

But rainbow_dash (and character tags in general) is not that type of tag. There are hundreds of possible <foo>_rainbow_dash combinations. The result being that you end up with meaningless or indefinite groups so that you can specify things like balloon_rainbow_dash, blob_rainbow_dash. Are blobs or balloons species, or morphologies? Sex acts, clothes, species, colors, camera angles.. ultimately a majority of the tags we have describe character traits or interactions.. 'things about characters that people want to search'. Okay, fine. Now let's add a second rainbow_dash. How do you search the parameters associated with them?

...

IOW, you can fake groups with parametrization. You can't actually achieve real groups via parametrization. This is testable using existing tools, like TMSU's tag=value support.
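
One possible way to illustrate the difference (my reading, with invented data and function names): with real groups a search has to be satisfied by a single character instance, while with parametrized tags the traits of all instances of that character end up pooled on the image.

```python
# One image with two rainbow_dash instances (hypothetical data).
instances = [
    {"character": "rainbow_dash", "traits": {"male", "feral"}},
    {"character": "rainbow_dash", "traits": {"female", "anthro"}},
]


def match_grouped(instances, character, wanted):
    """Per-instance groups: one single instance must carry every wanted trait."""
    return any(
        inst["character"] == character and wanted <= inst["traits"]
        for inst in instances
    )


def match_parametrized(instances, character, wanted):
    """Per-tag parameters: the image only stores '<trait>character' tags,
    so traits from different instances are pooled together."""
    pooled = set()
    for inst in instances:
        if inst["character"] == character:
            pooled |= inst["traits"]
    return wanted <= pooled
```

Searching for a male anthro rainbow_dash matches this image under the parametrized scheme but not under grouping, since no single instance is both male and anthro. For searches like male feral rainbow_dash the two agree.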

In case I wasn't clear before:

  • I think parametrizing _hair , _fur and such is sensible -- the relationships are few, easy to understand and non-controversial. I have, in fact, effectively had a superset of it implemented in my local TMSU system for quite some time.
  • I think that parametrizing character tags is not. It may not be a terrible idea (contingent on people managing to tag in a consistent, clear, and uncontroversial way), but it's far from being a gimme. We should really try hard to make absolutely sure that features we add don't make things more confusing -- those would really just be misfeatures in that case.

Updated by anonymous

When you say that <color> means {red, green, blue, aqua, brown, etc} you are defining an enumeration, as far as I can see.

Nope. When I say color, I mean tags that imply color. If you want the programmer's term, it's sub-classing, extending or implementing a class.

.. are you or are you not proposing to swap out the current tag system for a parametrized tag system?
As I understood it, when migrated to your system, a tagging could no longer be represented by a single integer, but would require multiple integers -- a template id, and a 'parameter value' id.

I'm proposing a backwards compatible extension to the current system, to add support for tags with parameters.

Eh, you're going into too much detail here and I don't know exactly how e621 works right now, i.e. how tags are stored and searched for. I will go look it up and reply to you then.

Updated by anonymous

I currently don't see how sub-classes in this case would be better than enums.
As * is a thing (https://e621.net/help/cheatsheet) I assume tags are stored in a Database like SQL, so I'd not even use enums.
Giving tags a category column and internally expanding the list of searched tags?

<param>_basetag -> SELECT *[_]basetag FROM tags WHERE category LIKE param -> ~row1_basetag ~row2_basetag ~...
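
Spelled out as a runnable sketch with an in-memory SQLite database: the tags table, its category column and the sample rows are assumptions made for this example, not the site's real schema. It also uses an exact match on category instead of LIKE, since no wildcard is involved.

```python
import sqlite3


def expand_generic(conn, param, basetag):
    """Expand <param>_basetag into the concrete tags whose prefix has that category."""
    rows = conn.execute(
        "SELECT name FROM tags WHERE category = ? ORDER BY name", (param,)
    )
    return [f"{name}_{basetag}" for (name,) in rows]


# Hypothetical schema: one row per value tag, with the category it belongs to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (name TEXT PRIMARY KEY, category TEXT)")
conn.execute("CREATE INDEX tags_category ON tags (category)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("red", "color"), ("green", "color"), ("blue", "color"), ("small", "size")],
)
```

expand_generic(conn, "color", "hair") then returns blue_hair, green_hair and red_hair, which the search could treat as alternatives.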

Updated by anonymous

Ok, I read a bit about ouroboros.

@savageorange

I'm no longer sure what the hell you are proposing

Forget about tag generation. That was just trivial part of text parsing I made up on the fly and it's not important right now.

Based on how the current database is structured, I'm proposing adding a new table tags_parameters, which holds parameters info.

There are hundreds of possible <foo>_rainbow_dash combinations. The result being that you end up with meaningless or indefinite groups so that you can specify things like balloon_rainbow_dash, blob_rainbow_dash. Are blobs or balloons species, or morphologies? Sex acts, clothes, species, colors, camera angles.. ultimately a majority of the tags we have describe character traits or interactions.. 'things about characters that people want to search'.

You're absolutely correct. But the problem isn't with the fact that you could create balloon_rainbow_dash. I'm sure character parameters would be minimized only to the important ones. So ending up with balloon_rainbow_dash would not be possible. Currently you can add any tag to any post so what can be worse than that ;)

The real problem is that you would need to create new parameters for every character every time. And if you have 30 possible character parameters, and 10000 characters, it's quite a problem.

The solution would be creating a character tag. And every character would then imply this tag. This character tag would be the tag that has parameters and any tag that implies character can use character's parameters. Sounds complicated :P

Now let's add a second rainbow_dash. How do you search the parameters associated with them?

If the user searches for a rainbow_dash with a penis in her hair, and the image has two rainbow dashes, then it doesn't really matter which one has penis in her hair, as long as one has it, then it matches user's tastes :D. Yes, you can't perfectly tag images which contain multiple same characters. But can you give me an example where this is a problem?

I think parametrizing _hair , _fur and such is sensible -- the relationships are few, easy to understand and non-controversial. I have, in fact, effectively had a superset of it implemented in my local TMSU system for quite some time.
I think that parametrizing character tags is not. It may not be a terrible idea (contingent on people managing to tag in a consistent, clear, and uncontroversial way), but it's far from being a gimme. We should really try hard to make absolutely sure that features we add don't make things more confusing -- those would really just be misfeatures in that case.

I also think that giving parameters to characters would be too big of a first step and we should start with small things first.

--
@rebane

I currently don't see how sub-classes in this case would be better than enums.
As * is a thing (https://e621.net/help/cheatsheet) I assume tags are stored in a Database like SQL, so I'd not even use enums.
Giving tags a category column and internally expanding the list of searched tags?

<param>_basetag -> SELECT *[_]basetag FROM tags WHERE category LIKE param -> ~row1_basetag ~row2_basetag ~...

A single column is not enough to support this feature because one tag can have several types of parameters (<gender>character or <species>character), and can have more than one parameter <male, female>sex or <male, female, female>sex. Also, LIKE is a very database intensive operation and should not be used under normal circumstances.

Updated by anonymous

Delian said:
Ok, I read a bit about ouroboros.

@savageorange
Forget about tag generation. That was just trivial part of text parsing I made up on the fly and it's not important right now.

Based on how the current database is structured, I'm proposing adding a new table tags_parameters, which holds parameters info.

Okay, this stuff is good. See, your initial posts were setting off a lot of warning signs.. Mostly 'this person knows enough programming to be dangerous, but not enough to tread cautiously when considering adding complexity to implement new features.'. But when you describe specifics like this, it's both easier to understand exactly what you are proposing, and reassuring that you are not totally feature-crazed ;)

I would somewhat expect two tables, to support a one-to-many mapping (first table defines basic info like id and name, second connects id with parameters)

You could do it with one (with multiple NULLable parameter type fields, param1 param2 param3 etc), but that is terrible and I'm afraid I would have to shoot you if you did that ;)

You're absolutely correct. But the problem isn't with the fact that you could create balloon_rainbow_dash. I'm sure character parameters would be minimized only to the important ones. So ending up with balloon_rainbow_dash would not be possible. Currently you can add any tag to any post so what can be worse than that ;)

Well, the blob_rainbow_dash example was not so hypothetical -- blob ponies is actually a thing. They're sort of an 'insanely chibified' version of ponies, I guess.

Not sure how to find them currently, this is the only post I could quickly find with a blob pony.

The solution would be creating a character tag. And every character would then imply this tag. This character tag would be the tag that has parameters and any tag that implies character can use character's parameters. Sounds complicated :P

Doesn't that then mean that you have destroyed the grouping ability? Since you can only have one foo_bar tag on an image, if you attach everything to the character tag, then you have placed it all in one group, which is essentially the same situation we currently have, with the same searching problems.

If the user searches for a rainbow_dash with a penis in her hair, and the image has two rainbow dashes, then it doesn't really matter which one has penis in her hair, as long as one has it, then it matches user's tastes :D. Yes, you can't perfectly tag images which contain multiple same characters. But can you give me an example where this is a problem?

The scheme you proposed previously is close enough that it would work for most situations. Not close enough to be equivalent to tag grouping though (which would do this job in a way that is both simpler and more flexible than attempting to adapt parametrization to the job).

Just trying to point out that if you claim weakly related features as a 'side effect', then you can get into a big mess trying to make these 'corner cases' work well -- it's better to stay intentionally modest in scope IMO.

@rebane
A single column is not enough to support this feature because one tag can have several types of parameters (<gender>character or <species>character), and can have more than one parameter <male, female>sex or <male, female, female>sex.

.. Oh, I didn't cover this before. I'm also not particularly in favor of multiple parametrization. Every additional complexity adds to the amount of maintenance in the future, so personally I would be looking for concrete stats on how many current tags would actually be a good fit for multiple parametrization.

Currently I see:

  • <sex>_on_<sex> eg herm_on_female
  • <bodypart>_on_<bodypart> -- debatable, mainly hand_on_<bodypart>
  • <morphology>_on_<morphology> eg. anthro_on_feral
  • <familymember>_and_<familymember> -- but there are only a few (brother_and_sister, father_and_daughter, mother_and_daughter, mother_and_son, father_and_son) and this pattern doesn't include other tags that fit in that 'tag family' ;) (brothers, sisters, siblings)
  • Maybe any threesome (but not any moresome, nobody's gonna tag that, it would be too confusing to search.) <sex>_<sex>_<sex>_sex like you say.

There are other possibilities but they don't seem to have enough corresponding tags to justify them:

  • <bodypart>_in_<clothing> (hand_in_pocket etc)
  • <bodypart>_in_<bodypart> (finger_in_mouth etc)

BTW, on Linux (or Mac, or Windows with mingw, etc), if you have a list of all tags, "grep -Ee '_([a-z]{1,5})_' < mytaglist | less" will list the ones that might be two-parameter, ie. NOUN_CONNECTIVE_NOUN. All tags that conform to e621's usual tagging conventions should be picked up by this.
If you want to be really liberal I guess you could increase 5 to 9.
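
For anyone without grep handy, roughly the same filter in Python, plus a tally of the connectives found. The function names and the max_len parameter are made up here, and the pattern is the one from the grep command above:

```python
import re
from collections import Counter


def two_parameter_candidates(tags, max_len=5):
    """Return tags containing _WORD_ where WORD is 1..max_len lowercase letters."""
    pattern = re.compile(r"_([a-z]{1,%d})_" % max_len)
    return [tag for tag in tags if pattern.search(tag)]


def connective_counts(tags, max_len=5):
    """Count how often each connective (the first _WORD_ in a tag) occurs."""
    pattern = re.compile(r"_([a-z]{1,%d})_" % max_len)
    counts = Counter()
    for tag in tags:
        match = pattern.search(tag)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Like the grep version, this only finds candidates; any tag with a short middle word gets picked up, whether or not it is really NOUN_CONNECTIVE_NOUN.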

Updated by anonymous

savageorange said:

  • <morphology>_on_<morphology> eg. anthro_on_feral

*quietly writes note to self to use the word "morphology" for this concept from now on*

Updated by anonymous

I would somewhat expect two tables, to support a one-to-many mapping (first table defines basic info like id and name, second connects id with parameters)

You could do it with one (with multiple NULLable parameter type fields, param1 param2 param3 etc), but that is terrible and I'm afraid I would have to shoot you if you did that ;)

I thought about having 2 tables, but I found that having 1 table with 4 parameter columns (I don't think we should support having more than 4 parameters) is more optimal. This is because we're not dealing with 1-to-many but 1-to-4, and because we always need the parameters data. This eliminates one whole join operation with almost no redundant data. But you can shoot me if you want ;)
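
To compare the two layouts side by side, here is a sketch using an in-memory SQLite database. Only the table name tags_parameters comes from this thread; the column names, the second table and the sample rows are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout A: one row per generic tag, up to four nullable parameter types.
    CREATE TABLE tags_parameters (
        tag TEXT PRIMARY KEY,
        param1 TEXT, param2 TEXT, param3 TEXT, param4 TEXT
    );
    -- Layout B: one row per (tag, position) pair, any number of parameters.
    CREATE TABLE tag_parameter_rows (
        tag TEXT, position INTEGER, param_type TEXT,
        PRIMARY KEY (tag, position)
    );
    INSERT INTO tags_parameters VALUES ('hair', 'color', NULL, NULL, NULL);
    INSERT INTO tags_parameters VALUES ('in', 'body_part', 'body_part', NULL, NULL);
    INSERT INTO tag_parameter_rows VALUES ('hair', 1, 'color');
    INSERT INTO tag_parameter_rows VALUES ('in', 1, 'body_part');
    INSERT INTO tag_parameter_rows VALUES ('in', 2, 'body_part');
""")


def params_wide(tag):
    """Layout A: read the four columns and drop the unused (NULL) ones."""
    row = conn.execute(
        "SELECT param1, param2, param3, param4 FROM tags_parameters WHERE tag = ?",
        (tag,),
    ).fetchone()
    return [p for p in row if p is not None] if row else []


def params_rows(tag):
    """Layout B: read one row per parameter, in position order."""
    rows = conn.execute(
        "SELECT param_type FROM tag_parameter_rows WHERE tag = ? ORDER BY position",
        (tag,),
    )
    return [p for (p,) in rows]
```

Both functions return the same parameter list for a tag. The wide layout reads a single row but caps the count at four and stores NULLs for unused slots, while the row layout has no cap but needs several rows per tag.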

Doesn't that then mean that you have destroyed the grouping ability? Since you can only have one foo_bar tag on an image, if you attach everything to the character tag, then you have placed it all in one group, which is essentially the same situation we currently have, with the same searching problems.

Nope. To put it simply, searching for <male>rainbow_dash isn't the same as searching for <male>character. Grouping by character means making groups where the key is a distinct character.

The scheme you proposed previously is close enough that it would work for most situations. Not close enough to be equivalent to tag grouping though (which would do this job in a way that is both simpler and more flexible than attempting to adapt parametrization to the job).

The actual grouping per character works with character "instances". My solution works with distinct character types. However, semantically they are the same. Think about it. This is because you cannot identify character instances when searching for tags. If an image has 3 rainbow_dashes on it, searching will never care about which of the 3 characters you're searching for, as long as one of the instances contains the search terms. If you search for female rainbow_dash and the image has 3 female rainbow_dashes, you cannot specify which one of the three you're looking for.

So I am proud to say that my proposed feature would work for *all* search cases, and that it's a more general, more robust, and more flexible solution than the actual character grouping per instance :)

There are other possibilities but they don't seem to have enough corresponding tags to justify them:

  • <bodypart>_in_<clothing> (hand_in_pocket etc)
  • <bodypart>_in_<bodypart> (finger_in_mouth etc)

I think the problem for such tags is that the number of possible combinations is exponential. That's why users aren't tagging images: they would need to create a new tag every time, a new tag that isn't really standardized, described or easily searchable. Users usually won't insert tags when they aren't sure the tags exist, or when they don't know that they can describe an image that way. But with generic tags like that, it would be easier than ever.

BTW, on Linux..

I don't want to create a whole test environment lol.. I'm not an admin.

Btw, what do you think about the following things:
<bodypart>penetrating<bodypart>
<male,female>doggystyle (since doggystyle implies sex, it gets parameters from sex)
<character,character>sex

Updated by anonymous

Delian said:

Btw, what do you think about the following things:
<bodypart>penetrating<bodypart>
<male,female>vaginal_penetration (since penetration implies sex, it gets parameters from sex)
<character,character>sex

This is kind of tangential to your ideas, but currently penetration doesn't imply sex as insertions like dildos or tentacles also count as penetration.

Updated by anonymous

Ah, yes, I forgot that sex is only when there are 2 characters sexually interacting. I was looking through the sex tag description and I saw penetrations there, so I thought those tags actually imply sex. Let me edit it then.

Updated by anonymous

Delian said:
I thought about having 2 tables, but I found that having 1 table with 4 parameter columns (I don't think we should support having more than 4 parameters) is more optimal. This is because we're not dealing with 1-to-many but 1-to-4, and because we always need the parameters data. This eliminates one whole join operation with almost no redundant data. But you can shoot me if you want ;)

I suppose exactly how terrible it would be depends on what proportion of parametrizable tags actually have 3 or 4 parameters. So far it seems like they would be a minority, something like {1: 60%, 2: 35%, 3: 5%, 4: 1%}.
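
For reference, the single-table layout being debated could look roughly like the sketch below. The table and column names (post_tags, param1 to param4) are made up for illustration and are not e621's actual schema; SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: one row per (post, tag), with up to four nullable
# parameter columns instead of a separate parameters table.
SCHEMA = """
CREATE TABLE post_tags (
    post_id INTEGER NOT NULL,
    tag     TEXT    NOT NULL,
    param1  TEXT,
    param2  TEXT,
    param3  TEXT,
    param4  TEXT
);
"""

def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    rows = [
        (1, "hair", "red", None, None, None),
        (1, "in", "penis", "vagina", None, None),
        (2, "hair", "blue", None, None, None),
    ]
    db.executemany("INSERT INTO post_tags VALUES (?, ?, ?, ?, ?, ?)", rows)
    return db

def posts_with(db, tag, param1):
    """Find posts by tag and first parameter; no join is needed."""
    cur = db.execute(
        "SELECT post_id FROM post_tags "
        "WHERE tag = ? AND param1 = ? ORDER BY post_id",
        (tag, param1),
    )
    return [row[0] for row in cur]
```

The trade-off is visible in the demo rows: a one-parameter tag leaves three columns empty, so how wasteful this is depends on the real distribution of parameter counts.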

The actual grouping per character works with character "instances". My solution works with distinct character types. However, semantically they are the same. Think about it. This is because you cannot identify character instances when searching for tags. If an image has 3 rainbow_dashes on it, searching will never care about which of the 3 characters you're searching for, as long as one of the instances contains the search terms. If you search for female rainbow_dash and the image has 3 female rainbow_dashes, you cannot specify which one of the three you're looking for.

So I am proud to say that my proposed feature would work for *all* search cases, and that it's a more general, more robust, and more flexible solution than the actual character grouping per instance :)

It's actually just a particular implementation of Char's grouping idea, which, as far as I can tell, you are mixing in with the generic-tag proposal :|

There is no actual character grouping per instance implementation in e621. Grouping is an idea that's been around for some time but hasn't been implemented.
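
To make the "any instance matches" point concrete, here is a rough sketch of how a search over per-character groups might behave. Grouping is not implemented, so the data layout (a post as a list of tag sets, one per character) is purely an assumption.

```python
def post_matches(character_groups, search_terms):
    """Return True if at least one character carries all search terms."""
    wanted = set(search_terms)
    return any(wanted <= group for group in character_groups)

# Hypothetical post with three characters, each with its own tag set.
post = [
    {"rainbow_dash", "female", "blue_fur"},
    {"rainbow_dash", "female", "blue_fur"},
    {"applejack", "female", "hat"},
]
```

Searching for female rainbow_dash matches this post, but the search cannot say which of the two rainbow_dash entries matched, only that one did. Searching for rainbow_dash hat does not match, because no single character has both tags.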

I think the problem with such tags is that the number of possible combinations is exponential. That's why users aren't tagging images this way: they would need to create a new tag every time, one that isn't really standardized, described or easily searchable. Users usually won't add a tag if they aren't sure it exists, or if they don't know that they can describe an image that way. But with generic tags like that, it would be easier than ever.

I think if you ask someone who does a lot of tagging and cleanups -- like Parasprite or Genjar -- they would be a lot more ambivalent about whether creating new tag permutations easily is a good thing.

Locating existing tags easily is less controversial -- there's really no reason that it wouldn't be good to help the user avoid typos and mutated versions of existing tags.

I don't want to create a whole test environment lol.. I'm not an admin.

What?
I was specifying how I myself did the test with the data I already have, and how you could too if you happened to be running Linux.

(BTW, you don't need to set up a system to run Linux. LiveCDs are a thing.)

<male,female>doggystyle (since doggystyle implies sex, it gets parameters from sex)

Not sure what male,female is supposed to specify here. When I think about doggystyle, I figure that the 'bottom' can be any sex, and the top can be any sex that has a penis (so: male, herm, dickgirl). I'd hope that people can speak up if they want to be able to search this. My own opinion is that this is mostly not needed -- <sex>_on_<sex> + doggystyle should get results that are close enough.

<character,character>sex

I think basically everyone who does a lot of tag management would disagree with that, for the same reason you object to <bodypart>_in_<bodypart>.

Updated by anonymous

as far as I can tell, you are mixing in with the generic-tag proposal :|

If you can kill two problems with one stone, then why not. It's all a part of a greater masterplan ;)

I think if you ask someone who does a lot of tagging and cleanups -- like Parasprite or Genjar -- they would be a lot more ambivalent about whether creating new tag permutations easily is a good thing.

Hai Parasprite! Regarding <bodypart>in<bodypart>, do you think creating permutations like that (e.g. tail_in_mouth) is a good or a bad thing? And why? So when you're tagging, and you see a distinct feature on an image, a feature that can only be described with <bodypart>in<bodypart>, and the tag for it doesn't exist yet, what would you do? Would you create a new tag for it? Or not? And why? Do you think generic tags would solve this problem of infinite permutations?

(<male, female>doggystyle)
Not sure what male,female is supposed to specify here. When I think about doggystyle, I figure that the 'bottom' can be any sex, and the top can be any sex that has a penis (so: male, herm, dickgirl). I'd hope that people can speak up if they want to be able to search this. My own opinion is that this is mostly not needed -- <sex>_on_<sex> + doggystyle should get results that are close enough.

That's interesting reasoning. But you should've been thinking in a more general way. Since doggystyle implies sex, it can be counted as "a type of sex". <male,female>doggystyle implies <male,female>sex. So tags only need to contain <male,female>doggystyle for the male/female search to work. It's like a hidden feature ;)

About <bodypart>penetrating<bodypart>: we currently have vaginal_penetration and other penetration tags. But vaginal_penetration basically means <anything>penetrating<vagina>. So instead of having all the penetration tags, we could simply have this one more powerful tag. This would allow users to search for <penis>penetrating<vagina>, <penis>penetrating<anything> (right now this cannot be searched for), or <anything>penetrating<vagina>.
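
A minimal sketch of how such wildcard searches could work, assuming a two-parameter tag is stored as a (name, first, second) tuple and <anything> is represented by a wildcard value. The storage format and the function names are assumptions for illustration, not an existing feature.

```python
ANY = None  # wildcard: matches whatever is in that parameter slot

def matches(tag, query):
    """tag and query are (name, first_param, second_param) tuples."""
    if tag[0] != query[0]:
        return False
    return all(q is ANY or q == t for t, q in zip(tag[1:], query[1:]))

def search(posts, query):
    """posts maps post id -> list of tag tuples; returns sorted matching ids."""
    return sorted(
        pid for pid, tags in posts.items()
        if any(matches(tag, query) for tag in tags)
    )

# Made-up posts for illustration.
posts = {
    1: [("penetrating", "penis", "vagina")],
    2: [("penetrating", "dildo", "vagina")],
    3: [("penetrating", "penis", "anus")],
}
```

With this data, the query ("penetrating", ANY, "vagina") plays the role of today's vaginal_penetration and finds posts 1 and 2, while ("penetrating", "penis", ANY) finds posts 1 and 3.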

Updated by anonymous

Delian said:
That's interesting reasoning. But you should've been thinking in a more general way.

No dude, I just didn't see what <male, female>doggystyle was supposed to mean, at all, full stop.
There don't seem to be any options to parametrize male and female with. The only thing I could think of was 'has something to penetrate with' vs 'has an orifice'.

Your most recent reply suggests that you're not even talking about parametrizing them, despite writing them using the <parameter>_foo form.

Updated by anonymous

savageorange said:
No dude, I just didn't see what <male, female>doggystyle was supposed to mean, at all, full stop.
There don't seem to be any options to parametrize male and female with. The only thing I could think of was 'has something to penetrate with' vs 'has an orifice'.

Your most recent reply suggests that you're not even talking about parametrizing them, despite writing them using the <parameter>_foo form.

Ok, let's forget about this detail, it's not important.

Anyway, this discussion has been pretty productive. Do you think I should update the original post and add some more info there? What should I add?

Updated by anonymous

Delian said:

Hai Parasprite! Regarding <bodypart>in<bodypart>, do you think creating permutations like that (e.g. tail_in_mouth) is a good or a bad thing? And why? So when you're tagging, and you see a distinct feature on an image, a feature that can only be described with <bodypart>in<bodypart>, and the tag for it doesn't exist yet, what would you do? Would you create a new tag for it? Or not? And why? Do you think generic tags would solve this problem of infinite permutations?

Generally I try to see if the tag exists under a different name first, as sometimes taggers can come up with different names independently without realizing it (it's worth mentioning that we also have both tail_in_mouth and tail_nom).

If I can't find any sort of tag in use, it really depends on if there's enough need for that specific of a tag. When you think about the infinite number of x_in_y combinations there could possibly be (even just sticking with body parts), you need to back up and say "Hey, is this really worth it?". For instance, it doesn't make a whole lot of sense to tag things like elbow_in_mouth, tail_inside_butt, or penis_tentacles_inside_ears if nobody else is going to search for it or tag it. Is it more accurate tagging? Yes. Is it worth anybody's effort to tag like this? It depends.

Some things may be common enough to warrant a tag that specific. For instance, condom_in_mouth, tail_sex, ear_penetration, etc. are all things that are being actively tagged. The first (condom_in_mouth) just happens to be a fairly common theme. The other two are being actively tagged, but they are also less specific. If you'll notice, tail_sex is less specific than tail_inside_butt, but still would help you narrow down searches all the same (just try searching tail_sex anal and you'll see what I mean). Likewise, things inside the ears are extremely uncommon, so it doesn't really make sense to get any more specific than that as it'll hurt searches (was that under ear_fingering, ear_fisting, or hoof_in_ear...). In fact, things inside ears are so uncommon that we don't even have an *_insertion tag for it, despite that being inconsistent with other similar tags.

Even if we had something like a dropdown menu to handle these combinations like this:

[  penis  ]_in_[    anus   ], [  prolapsed  ]

               [    ear    ]

               [░░░mouth░░░]◄-

               [   vagina  ]
                         

We would still need to severely limit the number of options in order for it to be practical for searching and efficient for tagging. Otherwise obscure things (i.e., shitting_dicknipples) would need to be accounted for and we'd have hundreds of options to scroll through, which would make tagging/searching a huge pain in the ass. Autocompletion would probably be better for this, but we would still need to do a lot of things manually (e.g., aliases) to concentrate tagging effort.

TLDR: It's hard enough to get taggers to tag anal_penetration instead of anal, or even just sex, and even the heavy taggers aren't going to want to spend 5 minutes describing each post with every exact detail.

Updated by anonymous

Oh, that's some nice info, thanks for your input. Let's see..

Some things may be common enough to warrant that specific of a tag for it.

So the decision whether to tag a feature on an image is a very subjective matter. If users knew that they already have a tag such as <bodypart>in<bodypart>, which they can use in any way to describe an image, so they would be 100% sure the tag already exists, would you say users would tag more things?

Likewise, things inside the ears are extremely uncommon, so it doesn't really make sense to get any more specific than that as it'll hurt searches (was that under ear_fingering, ear_fisting, or hoof_in_ear...)

That made me laugh hard ^^ But that's a very important observation. This tells us that, the more rare the feature is, the less specific and more obscure the tags become. So let me ask you this. Do you think generic tags would solve this problem? That is, remain accurate without hurting searches. (For instance, tagging <hand>in<ear> would allow us to find the image with queries <bodypart>in<ear> (ear_penetration) or <hand>in<bodypart> (fisting))
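
The <hand>in<ear> example can be sketched as a type check over implications, in the same spirit as "red implies color" from the original post. The implication table below is hypothetical, and so is the (first, name, second) tuple format.

```python
# Hypothetical implication table: each value lists what it directly implies.
IMPLIES = {
    "hand": {"bodypart"},
    "ear": {"bodypart"},
    "penis": {"bodypart"},
}

def is_a(value, wanted):
    """True if value equals wanted or implies it, directly or transitively."""
    seen, stack = set(), [value]
    while stack:
        current = stack.pop()
        if current == wanted:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(IMPLIES.get(current, ()))
    return False

def query_matches(tag, query):
    """tag and query are (first_param, name, second_param) tuples."""
    return (
        tag[1] == query[1]
        and is_a(tag[0], query[0])
        and is_a(tag[2], query[2])
    )
```

A post tagged ("hand", "in", "ear") then matches both ("bodypart", "in", "ear") and ("hand", "in", "bodypart"), so the precise tag stays findable through the broader queries.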

Even if we had something like a dropdown menu to handle these combinations ... we would still need to severely limit the number of options in order for it to be practical for searching and efficient for tagging.

Indeed, having too many options isn't practical. But fortunately there is a solution.
1. Sort the dropdown menu based on popularity.
2. Hide the rare elements until the user scrolls to the bottom or presses "Show All".
3. As you mentioned, include autocompletion.

This way the dropdown menu remains practical, since the most used tags will be on top, while retaining the option to show the full array and filter it as you type.
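
The three points above can be sketched in a few lines. The usage counts are made up, and the function name and its parameters (typed, show_all, top_n) are assumptions, not part of any existing interface.

```python
def dropdown_options(usage_counts, typed="", show_all=False, top_n=3):
    """Return option names, most used first.

    usage_counts maps option name -> number of posts using it.
    typed filters by prefix (autocompletion); show_all reveals rare options.
    """
    candidates = [name for name in usage_counts if name.startswith(typed)]
    candidates.sort(key=lambda name: (-usage_counts[name], name))
    return candidates if show_all else candidates[:top_n]

# Made-up usage counts, for illustration only.
counts = {"mouth": 900, "vagina": 1500, "anus": 1200, "ear": 4, "urethra": 30}
```

Because the prefix filter runs before the cut-off, a rare option such as ear still shows up as soon as the user starts typing it, even without pressing "Show All".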

Mmm.. nothing like some shitting_dicknipples. But seriously, there are quite a few dicknipples posts :o. Anyway, we would need some sort of guideline for what counts as a body_part.

It's hard enough to get taggers to tag anal_penetration instead of anal, or even just sex

Ok, another question. Do you think it would be easier (require less brain power) to tag <penis>in<vagina> or penis + vaginal_penetration?

Updated by anonymous

Delian said:
So the decision whether to tag a feature on an image is a very subjective matter. If users knew that they already have a tag such as <bodypart>in<bodypart>, which they can use in any way to describe an image, so they would be 100% sure the tag already exists, would you say users would tag more things?

Some would, most wouldn't. Generally people seem to tag within their preferred interests or fetishes but lack in other areas. For instance, someone interested in guns and machines might be really good at tagging things like holding_weapon, rifle, and helicopter instead of just aircraft. Or someone interested in macro and dragons might know when to use tags like crush, stomping, micro vs. macro, western_dragon, and the foot fetish-related tags, but on the other hand always seems to forget to tag penis or pussy.

However, even if they know which tags to use there's always going to be an upper limit on the number of tags people care about depending on how much they upload, how much tagging they do, etc. and I'd rather prioritize important tags (cub, gender, gender/gender combos, and other things that are highly likely to be blacklisted — scat, gore, hyper, MLP, castration, etc.) over tagging things like vaginal_penetration where...yeah it's helpful, but not if it means forgetting something important.

Do you think generic tags would solve this problem? That is, remain accurate without hurting searches. (For instance, tagging <hand>in<ear> would allow us to find the image with queries <bodypart>in<ear> (ear_penetration) or <hand>in<bodypart> (fisting))

Maybe. It seems fairly intuitive to me to work within a system like this, and we do already have analogues for most of these tags in some form or another (fisting -> vaginal_fisting, anal_fisting, urethral_fisting, etc.). It may be slightly more confusing for all the combinations that we would need to cover: vaginal_fingering -> <fingers>_in_<pussy>, vaginal_fingering -> <hand? fist?>_in_<pussy>, no equivalent -> <fingers>_on_<pussy>, etc.

I don't know though. It seems like it might look great on paper but wind up being too confusing even with a really good interface to work with. I can kind of picture bits of it but without a working model in front of me to actually play with and tweak it's kind of hard to make a judgement call on this.

Indeed, having too many options isn't practical. But fortunately there is a solution.
1. Sort the dropdown menu based on popularity.
2. Hide the rare elements until the user scrolls to the bottom or presses "Show All".
3. As you mentioned, include autocompletion.

This way the dropdown menu remains practical, since the most used tags will be on top, while retaining the option to show the full array and filter it as you type.

I hadn't thought of using a "show all" button; that might help make up for the lack of options. Either way it would probably need to be hand-picked rather than automatically generated, or we'd need user-changeable categories like orifice:[anal] and bodypart:[proboscis]...but I'm not sure on the details with this.

Mmm.. nothing like some shitting_dicknipples. But seriously, there are quite a few dicknipples posts :o. Anyway, we would need some sort of guideline for what counts as a body_part.

Yeah, furries and Japan complicate this one every single time.

Ok, another question. Do you think it would be easier (require less brain power) to tag <penis>in<vagina> or penis + vaginal_penetration?

Probably about as much, or close enough at least. penis + pussy + male/female (or other combination) + sex seems about the simplest I've seen it being described while being at least somewhat accurate, even if it is less precise.

Updated by anonymous

parasprite said:
Some would, most wouldn't. Generally people seem to tag within their preferred interests or fetishes but lack in other areas.

Yes, people tag based on their interests. But that can also be because users only remember the tags relevant to their interest, and that's why they don't tag other tags. So I think we can easily assume that, in general, the more tags the user remembers, the more tags he will add to the post. And generic tags could make it much easier to remember tags because they would reduce the total amount of tags and because dropdown menus would allow the user to choose an option rather than having to remember it.

yeah it's helpful, but not if it means forgetting something important.

Well, we can't force the users to tag. However, what we can do is make tags easier to learn and remember, in the hope that they will add more of them.

without a working model in front of me to actually play with and tweak it's kind of hard to make a judgement call on this.

Ugh.. I *could* make a working model. But I would need existing source code. And only if admins would be seriously considering adding this feature. I don't wanna waste a lot of time coding just for shits and giggles.

Either way it would probably need to be hand picked rather than automatically generated

For some tags (red_hair) it can be semi-automatically generated, but for body_part it would have to be manual.
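
As a rough idea of what "semi-automatically" could mean for color tags: split an existing tag on its first underscore and accept the split only if the prefix is a known color. The color list and the function name are assumptions for illustration.

```python
COLORS = {"red", "green", "blue"}  # assumed list of tags that imply color

def to_generic(tag, parameter_values=COLORS):
    """Split e.g. 'red_hair' into ('red', 'hair'); return None if it doesn't fit."""
    prefix, sep, rest = tag.partition("_")
    if sep and rest and prefix in parameter_values:
        return (prefix, rest)
    return None
```

This handles red_hair and blue_eyes, and leaves tail_in_mouth alone. It would also split red_panda into a color plus "panda", which is why the output would still need a human to review it.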

penis + pussy + male/female (or other combination) + sex seems about the simplest I've seen it being described while being at least somewhat accurate, even if it is less precise.

So terms need to be simple. In my opinion vaginal_penetration is already a higher concept that goes against the "tag what you see" principle, while <penis>_in_<pussy> is exactly "what you see". When users tag, they don't think about what is implied, or what something means. They don't think about higher concepts like vaginal_penetration. They simply see a cock in a pussy :). In my opinion, even duo is too complicated a term and should be 2_chars instead.

Updated by anonymous

Updated OP with discussion findings and added pros and cons.

Updated by anonymous
