Topic: TWYS vs text

Posted under General

After coming across this, I decided to finally raise the question of where we stand on tagging text in images. Is it appropriate to tag text based on what it appears to be rather than what it is?

Updated by notnobody

Text is ignored, except for text-based tags (profanity, for instance).

As for "appears", please be more specific; the example is in a language I do not understand so I can't give an answer.

Updated by anonymous

1. I can't read Japanese. Can someone translate it?
2. No, because liars exist.

Updated by anonymous

Siral_Exan said:
Text is ignored, except for text-based tags (profanity, for instance).

As for "appears", please be more specific; the example is in a language I do not understand so I can't give an answer.

BlueDingo said:
1. I can't read Japanese. Can someone translate it?
2. No, because liars exist.

In this case, the image containing chinese_text was tagged japanese_text. I feel like text shouldn't be tagged as being one language or another when the tagger doesn't know for sure.

Updated by anonymous

wous said:
In this case, the image containing chinese_text was tagged japanese_text. I feel like text shouldn't be tagged as being one language or another when the tagger doesn't know for sure.

Oh, I think your question was read wrong.
The text itself doesn't affect tagging, but if it's clearly Chinese and not Japanese, that's clearly false tagging, which isn't accepted. If you see someone doing this, especially constantly, simply report the user and link to the applicable posts.

Also, I think many people who see text that isn't in a Western alphabet think of Japan first. I would strongly suggest they simply tag text only. That way, at least nothing is falsely tagged.

Updated by anonymous

Mario69 said:
Oh, I think your question was read wrong.
The text itself doesn't affect tagging, but if it's clearly Chinese and not Japanese, that's clearly false tagging, which isn't accepted. If you see someone doing this, especially constantly, simply report the user and link to the applicable posts.

Also, I think many people who see text that isn't in a Western alphabet think of Japan first. I would strongly suggest they simply tag text only. That way, at least nothing is falsely tagged.

I can't say I've ever noticed a pattern over the years that goes beyond honest mistakes in this regard. Mistagging of this type is actually fairly rare these days, occurring in at most one in twenty cases by my estimate.

Updated by anonymous

Text is useful for corroborating the incest tag. There's always some judgement involved in applying that tag since TWYS can't confirm fictional genetic makeup.

Updated by anonymous

abadbird said:
Text is useful for corroborating the incest tag. There's always some judgement involved in applying that tag since TWYS can't confirm fictional genetic makeup.

Despite the title, that's not what this thread is about. It's about "If you don't know for sure what language the text in a picture is, is it ok to, for example, use Google Translate to determine the language and tag it as such?".

The main point of *_text tags is probably so translators can find those pictures, so tagging a probably correct language is likely better than no language tagged at all, I think...

Updated by anonymous

abadbird said:
Text is useful for corroborating the incest tag. There's always some judgement involved in applying that tag since TWYS can't confirm fictional genetic makeup.

That's still under debate. forum #209445

Updated by anonymous

Mario69 said:
if it's clearly Chinese and not Japanese

Aren't they mostly the same? That obviously doesn't apply with other languages, but it might be relevant in this particular example.

Updated by anonymous

Imuthes said:
Aren't they mostly the same? That obviously doesn't apply with other languages, but it might be relevant in this particular example.

They are definitely not the same. Japanese writing adopted some Chinese characters, but the two are visibly different in appearance.

To put it simply, chinese_text mostly looks denser and blockier, with characters that look like they fit in a box, while japanese_text is less dense and has smoother lines.

Updated by anonymous

I'd like to suggest unknown_language for this situation.

Updated by anonymous

Random said:
I'd like to suggest unknown_language for this situation.

+1 to this idea. An unknown language tag would be super useful for those unsure of what language they're tagging, and would help with proper tagging by people who search for posts with this tag.

Updated by anonymous

DiceLovesBeingBlown said:
+1 to this idea. An unknown language tag would be super useful for those unsure of what language they're tagging, and would help with proper tagging by people who search for posts with this tag.

It (unknown_language) already exists...

Updated by anonymous

Siral_Exan said:
It (unknown_language) already exists...

I'm not entirely comfortable with using that tag for languages that taggers don't know, because it also mixes in made-up or unnamed languages.

Updated by anonymous

Strikerman said:
I'm not entirely comfortable with using that tag for languages that taggers don't know, because it also mixes in made-up or unnamed languages.

It isn't my fault, and I'd vouch for language_request, in keeping with the rest of our *_request tags.

Updated by anonymous

Why not just do ambiguous_text like we do with the various other ambiguous_* tags? It's not actually unknown - just an unknown one of a few possible options. And depending on the exact content, even a fully qualified person might not be able to definitively say which it is (say if half of the text in the example image were occluded and they still weren't using the simplified forms). It would be especially misleading in a case like this where the user's name has Japanese in it and then they have a Tumblr page in Chinese.

Updated by anonymous

Imuthes said:
Aren't they mostly the same? That obviously doesn't apply with other languages, but it might be relevant in this particular example.

The post I linked before has since been translated by TheGreatWolfgang (thank you for that), and I can tell you that I would never have thought that the first character was 玉. It's exactly the same between languages in terms of strokes, but there's a difference in the way it's written by hand. I might have only pulled that translation off if I started counting up through elevations that end in 952 on Google, and even then I wouldn't have been sure.

A bit of trivia: I think 玉 means hardly any of the same things in Chinese that it does in Japanese.

Updated by anonymous

wous said:
I might have only pulled that translation off if I started counting up through elevations that end in 952 on Google, and even then I wouldn't have been sure.

The strokes were a bit off but I managed to identify it.

It sometimes helps to check the source and Google Translate for clues. The artist mentioned Taiwan in the source post; a quick Google search for "mountains in Taiwan" and there you go!

Updated by anonymous

notnobody said:
Why not just do ambiguous_text like we do with the various other ambiguous_* tags? It's not actually unknown - just an unknown one of a few possible options. And depending on the exact content, even a fully qualified person might not be able to definitively say which it is (say if half of the text in the example image were occluded and they still weren't using the simplified forms). It would be especially misleading in a case like this where the user's name has Japanese in it and then they have a Tumblr page in Chinese.

If we're using other tags as examples, I'd say that not knowing what the text is matches unknown species (not sure what the species is supposed to be) more than ambiguous species (not enough details of the species are seen).

Updated by anonymous

I'm distinguishing between a lack of clues to decide among a set of known options, versus loosely defined nonsense. Unknown would be similar to a fantastical newly invented creature with no species name or other named examples, or seemingly random strokes that appear to be intended as language but, as far as anyone can tell, aren't a real one. Ambiguous would be if some stylistic elements blended a few different species' looks together such that you couldn't reliably tell them apart, or if some dead-giveaway body part were occluded by something. For text, say you saw an example where single characters were strewn around the page as labels for various things, and they happened to be the same ones in two languages. In that case, you could genuinely say it could be either one, and that while you know which answers are possible, you need more info to actually be correct. It's a bit of a fine line, but I think it's more true to the example.
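To make the "known options, not enough clues" case concrete, here is a rough Python sketch. It assumes someone has already typed the characters out (the site itself only has the image), and it only covers the Chinese-versus-Japanese case from this thread. The function name suggest_text_tag is made up for illustration, and ambiguous_text is only the tag proposed above, not an existing one. The idea: hiragana or katakana letters point to Japanese, while Han characters alone are shared between the two languages and cannot settle it.

```python
from typing import Optional


def suggest_text_tag(transcribed: str) -> Optional[str]:
    """Hypothetical helper: suggest a language tag for transcribed text.

    Only handles the Chinese-vs-Japanese case; returns None otherwise.
    """
    # Hiragana letters (U+3041-U+3096) and katakana letters (U+30A1-U+30FA)
    # are used for Japanese, so their presence is a strong clue.
    has_kana = any(
        "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa"
        for ch in transcribed
    )
    # CJK Unified Ideographs (U+4E00-U+9FFF) appear in both languages.
    has_han = any("\u4e00" <= ch <= "\u9fff" for ch in transcribed)

    if has_kana:
        return "japanese_text"
    if has_han:
        # Known options, not enough clues: could be either language.
        return "ambiguous_text"  # proposed tag, assumed for this sketch
    return None  # not covered by this sketch


print(suggest_text_tag("玉山"))
print(suggest_text_tag("玉の山"))
```

A Han-only result is deliberately not mapped to chinese_text, since short Japanese labels can also be written without any kana. Checking the source for clues still has to settle those cases.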

Updated by anonymous
