Topic: Thumbnails are not always oriented correctly

Posted under Site Bug Reports & Feature Requests

It appears that metadata rotation (used mainly by cameras) doesn't get recognized by the code used to generate thumbnails.

Example: wolfyalex96 features a number of posts that are rotated incorrectly in the thumbnails but show up correctly when opened.

I assume the fix would require, at minimum, hunting down images that contain that metadata and applying it to their thumbnails as well.

Does e621 read thumbnail data embedded in the file (e.g. JPG supports this) to generate some thumbnails? If so, the image itself was probably rotated (and the rotation tag adjusted accordingly) without also rotating the embedded thumbnail data to match.

bad_metadata MAY apply, but the definition of bad_metadata should probably be made a lot clearer. Metadata as displayed by e621 can be wrong for reasons other than the file actually containing wrong metadata. Take one of the examples cited on the wiki page, 'md5 mismatch' -- barring actual malicious files, an md5 mismatch should only occur when e621's md5sum calculation fails. A file can't 'give incorrect information' about its md5sum. As it stands the page just lists symptoms, but it should define whether the tag is for when there is an identified problem with the *content* of the file, or just with e621's *handling* of the content of the file.

errorist said:
I assume the fix would require, at minimum, hunting down images that contain that metadata and applying it to their thumbnails as well.

We won't (can't) apply the metadata to their thumbnails, the jpeg rotation metadata should be cleared and the image itself should be rotated, and that should then be submitted as a replacement for the original file
We do this decently often

dba_afish said:
I believe that this is, in part, what bad_metadata is for

This isn't really what that tag is for, the posts should be fixed, not tagged with something

donovan_dmc said:
We won't (can't) apply the metadata to their thumbnails, the jpeg rotation metadata should be cleared and the image itself should be rotated, and that should then be submitted as a replacement for the original file
We do this decently often

This isn't really what that tag is for, the posts should be fixed, not tagged with something

I thought that bad_metadata was supposed to be used to mark stuff to be fixed

dba_afish said:
I thought that bad_metadata was supposed to be used to mark stuff to be fixed

Despite what the wikipage says I doubt any of that can or will be fixed ever, the majority of it (incorrect md5, invalid dimensions, bad filesize) would require direct database access

donovan_dmc said:
Despite what the wikipage says I doubt any of that can or will be fixed ever, the majority of it (incorrect md5, invalid dimensions, bad filesize) would require direct database access

I vaguely remember either a bot or an admin regularly going through and fixing these, but it seems that was never actually a thing. I might be misremembering seeing them fixed via report tickets or something...

alphamule

It feels like relying on JPEG files to provide a thumbnail is asking to be trolled with a misleading thumbnail. I'm actually kind of surprised.

Oh no, I just realized it's worse than I thought. It's not just the thumbnails; it's the scaled-down-to-save-bandwidth copies too. I didn't realize it because I was logged in at the time.

Besides rotation, what other metadata might be in images that would skew the results of an auto-downscaler if it's not copied into the resulting file? Colorspace, I guess.

donovan_dmc said:
We won't (can't) apply the metadata to their thumbnails, the jpeg rotation metadata should be cleared and the image itself should be rotated, and that should then be submitted as a replacement for the original file
We do this decently often

I meant it's something that the coding team will need to write a script to do. You know, scan every image in the database for ones that have a rotation flag, then pull up the associated downscaled copies and paste the flag into them. And then, to prevent that from having to be run again, update the code that generates those copies so it does the check automatically for all incoming images.
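For reference, the "scan for a rotation flag" step could look roughly like the sketch below. It is a minimal illustration, not e621's code: the function names `read_exif_orientation` and `_orientation_from_tiff` are hypothetical, and it only handles the common case of an Orientation tag (0x0112) stored in the first IFD of an Exif APP1 segment.

```python
import struct


def read_exif_orientation(data: bytes):
    """Return the EXIF Orientation value from JPEG bytes, or None if absent."""
    if data[:2] != b"\xff\xd8":            # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:              # lost sync with the marker stream
            return None
        marker = data[pos + 1]
        if marker == 0xDA:                 # start of scan: no metadata after this
            return None
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + seg_len]   # length includes its own 2 bytes
        if marker == 0xE1 and segment[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(segment[6:])
        pos += 2 + seg_len
    return None


def _orientation_from_tiff(tiff: bytes):
    """Look for tag 0x0112 in the first IFD of an embedded TIFF structure."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":                  # little-endian ("Intel") byte order
        endian = "<"
    elif tiff[:2] == b"MM":                # big-endian ("Motorola") byte order
        endian = ">"
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i    # each IFD entry is 12 bytes
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, value_type = struct.unpack(endian + "HH", entry[:4])
        if tag == 0x0112 and value_type == 3:   # Orientation, stored as a SHORT
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None
```

Any value other than 1 (or None) would mark the file as relying on metadata rotation. The sketch ignores padding bytes and other edge cases a production scanner would have to handle.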

errorist said:
Besides rotation, what other metadata might be in images that would skew the results of an auto-downscaler if it's not copied into the resulting file? Colorspace, I guess.

We use sRGB for the colorspace, and I'm sure that's not changing any time soon.

errorist said:
I meant it's something that the coding team will need to write a script to do. You know, scan every image in the database for ones that have a rotation flag, then pull up the associated downscaled copies and paste the flag into them. And then, to prevent that from having to be run again, update the code that generates those copies so it does the check automatically for all incoming images.

Uh, no. No, that's not something we can easily do. You're talking about combing through millions of images, which is no easy task, then regenerating possibly tens of thousands of thumbnails, which is also not an easy task. What's easier is to just remove the metadata when we see it and replace the original image.

FYI, our "team" is one unpaid volunteer. I'm sure anything about this is so low on his priority list that it may as well not be on there.

errorist said:
Oh no, I just realized it's worse than I thought. It's not just the thumbnails; it's the scaled-down-to-save-bandwidth copies too. I didn't realize it because I was logged in at the time.

In that case I would suspect that the rotation flag on the image itself, not the thumbnail, is incorrect. And part of the toolchain (probably the part that is generating samples and thumbnails) is respecting that flag, and part of the toolchain is ignoring it (probably the browser is ignoring it).

That makes more sense to me than my earlier guess, as images containing embedded thumbnails are not something you can really count on (but, you know, sometimes your support systems turn out to be more clever than you thought, in a way that causes you problems...)

savageorange said:
In that case I would suspect that the rotation flag on the image itself, not the thumbnail, is incorrect. And part of the toolchain (probably the part that is generating samples and thumbnails) is respecting that flag, and part of the toolchain is ignoring it (probably the browser is ignoring it).

It's the opposite: the original has rotation data, which the browser respects, but that rotation data is not preserved when generating thumbnails and samples.
The actual image is sideways, but there's some data in the image that says to rotate it. That works just fine for the original image but breaks down anywhere else.
I'm feeling like a broken record at this point, but the original image should have its rotation data cleared and the actual image should be rotated, then the post here should be replaced with that fixed version.
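To make "clear the rotation data and rotate the actual image" concrete, here is a small sketch on a plain grid of pixel values (a list of rows). The mapping from EXIF Orientation values 1 to 8 to the transform needed for upright display follows the standard EXIF convention; the helper names, including `bake_orientation`, are hypothetical and not taken from any e621 tooling.

```python
def flip_h(g):
    """Mirror left-right."""
    return [row[::-1] for row in g]


def flip_v(g):
    """Mirror top-bottom."""
    return [row[:] for row in g[::-1]]


def transpose(g):
    """Mirror across the upper-left to lower-right diagonal."""
    return [list(col) for col in zip(*g)]


def rot90_cw(g):
    return flip_h(transpose(g))


def rot180(g):
    return flip_h(flip_v(g))


def rot270_cw(g):
    return flip_v(transpose(g))


def transverse(g):
    """Mirror across the upper-right to lower-left diagonal."""
    return rot180(transpose(g))


# Transform that turns the stored pixels into the upright picture
# for each EXIF Orientation value.
UPRIGHT_TRANSFORM = {
    1: lambda g: [row[:] for row in g],   # already upright
    2: flip_h,
    3: rot180,
    4: flip_v,
    5: transpose,
    6: rot90_cw,
    7: transverse,
    8: rot270_cw,
}


def bake_orientation(pixels, orientation):
    """Return (upright_pixels, new_orientation) with the rotation baked in."""
    fix = UPRIGHT_TRANSFORM.get(orientation, UPRIGHT_TRANSFORM[1])
    return fix(pixels), 1                 # 1 means "no rotation needed"


stored = [[1, 2, 3],
          [4, 5, 6]]                      # sideways pixels tagged Orientation 6
upright, new_tag = bake_orientation(stored, 6)
print(upright, new_tag)
```

A thumbnailer that ignores the tag would scale `stored` as-is, which is the sideways result described above; one that calls something like `bake_orientation` first would not.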

Unfortunately that can't be done to a JPEG without losing quality. I think that's the main reason rotation metadata exists; someone got sick of not being able to rotate JPEGs losslessly.

errorist said:
Unfortunately that can't be done to a JPEG without losing quality. I think that's the main reason rotation metadata exists; someone got sick of not being able to rotate JPEGs losslessly.

It's very much possible to rotate jpegs without losing any quality
I took post #2408095
The original is 178,197 bytes
Rotated 180 is 178,184 bytes
Rotated 180 again (from the originally rotated post) is 178,197 bytes

There is zero data loss happening here
The amazingly complicated commands to do this

This isn't JPEG rotation metadata; this is actually rotating the JPEG. I uploaded it to a local e621 instance and the samples were generated in the same rotation as the image, proving it isn't EXIF rotation. No data was lost. This is not hard.

Watsit

Privileged

errorist said:
Unfortunately that can't be done to a JPEG without losing quality. I think that's the main reason rotation metadata exists; someone got sick of not being able to rotate JPEGs losslessly.

Because of the way JPEG is encoded in blocks, it should be possible to rotate in increments of 90 degrees without loss. It can't be rotated arbitrarily without a lossy reencode, but when rotating by 90 degrees, it's making lossless modifications to the compressed image data itself. Presuming at least that the app doing the rotation knows how to/can do it that way rather than a naive decode, transform, reencode method.
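A rough numerical illustration of why this works: for a single 8x8 block, rotating the pixels by 90 degrees clockwise and then taking the DCT gives the same coefficients as taking the DCT first and then transposing the coefficients and negating those with an odd horizontal frequency. The sketch below uses an unnormalised DCT-II and hypothetical function names. It is not how any particular tool is implemented, and a real tool also has to rearrange the blocks themselves and the quantisation tables.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct2(block):
    """Unnormalised 2-D DCT-II; out[u][v] = vertical frequency u, horizontal v."""
    return [[sum(block[y][x]
                 * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                 for y in range(N) for x in range(N))
             for v in range(N)]
            for u in range(N)]


def rotate_pixels_90_cw(block):
    """Rotate the pixel block 90 degrees clockwise."""
    return [list(col) for col in zip(*block[::-1])]


def rotate_coeffs_90_cw(coeffs):
    """Same rotation done purely on coefficients: transpose, then flip the
    sign of every coefficient whose horizontal frequency v is odd."""
    return [[(-1) ** v * coeffs[v][u] for v in range(N)] for u in range(N)]


block = [[(3 * y + 5 * x * x + x * y) % 256 for x in range(N)] for y in range(N)]
via_pixels = dct2(rotate_pixels_90_cw(block))
via_coeffs = rotate_coeffs_90_cw(dct2(block))
max_diff = max(abs(via_pixels[u][v] - via_coeffs[u][v])
               for u in range(N) for v in range(N))
print(max_diff < 1e-6)
```

Because the coefficient-side operation only moves values around and flips signs, already-quantised integer coefficients survive it exactly, which is why no reencode is needed.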

watsit said:
Because of the way JPEG is encoded in blocks, it should be possible to rotate in increments of 90 degrees without loss. It can't be rotated arbitrarily without a lossy reencode, but when rotating by 90 degrees, it's making lossless modifications to the compressed image data itself. Presuming at least that the app doing the rotation knows how to/can do it that way rather than a naive decode, transform, reencode method.

man jpegtran:

       -flip horizontal
              Mirror image horizontally (left-right).

       -flip vertical
              Mirror image vertically (top-bottom).

       -rotate 90
              Rotate image 90 degrees clockwise.

       -rotate 180
              Rotate image 180 degrees.

       -rotate 270
              Rotate image 270 degrees clockwise (or 90 ccw).

       -transpose
              Transpose image (across UL-to-LR axis).

       -transverse
              Transverse transpose (across UR-to-LL axis).

(I think this list covers every lossless transform that is in principle possible)

The jpegtran man page makes no mention of the Orientation tag, although it provides options to preserve such information (-copy all, which notably also includes embedded thumbnails). A proper 'lossless' conversion should probably at least use -copy icc.

This is all somewhat academic though, as automatic determination of what orientation is 'correct' is AFAIK not possible; the best you can do is identify that the thumbnail and image are differently oriented (via comparing dimensions or brute force).
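The "comparing dimensions" check could be sketched as follows. This is only an illustration under stated assumptions: the function name `thumbnail_looks_rotated` and the 5% tolerance are made up, and `full_size` is taken to be the dimensions of the image as it is displayed to the viewer.

```python
def thumbnail_looks_rotated(full_size, thumb_size, tolerance=0.05):
    """Guess whether a thumbnail is turned 90 degrees relative to its image.

    Both sizes are (width, height) tuples. Only aspect ratios are compared,
    so 180-degree rotations, mirror flips and square images go undetected.
    """
    full_w, full_h = full_size
    thumb_w, thumb_h = thumb_size
    full_ratio = full_w / full_h
    thumb_ratio = thumb_w / thumb_h
    # Does the thumbnail have the same shape as the image?
    matches_as_is = abs(thumb_ratio - full_ratio) <= tolerance * full_ratio
    # Does it have the shape the image would have with width and height swapped?
    matches_swapped = abs(thumb_ratio - 1 / full_ratio) <= tolerance / full_ratio
    return matches_swapped and not matches_as_is


print(thumbnail_looks_rotated((3000, 4000), (150, 113)))
print(thumbnail_looks_rotated((3000, 4000), (113, 150)))
```

As the post says, this can only flag a disagreement between the two; it cannot tell which of them is the "correct" orientation.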

Just like with color management, the only viable answer is for the artist to just do the correct thing to begin with, which for our purposes is probably to rotate the actual image, update thumbnail accordingly, and set the Orientation to 1 (unrotated). They'll probably want to do that through a GUI but it can be done via just a combination of jpegtran and exiv2.
Phone gallery apps provide rotation (which I think is lossless), usually through the cropping interface, but I can't really comment on whether they do something that is from our point of view sensible with thumbnails or Orientation tag.

Interesting. I would have assumed JPEG insists on aligning its blocks to the top left, making lossless rotation impossible unless both dimensions are divisible by 8.

Anyway, assuming someone was willing to start doing this to any tag-rotated JPEGs they find, would this be a viable use case for the replacement feature? The only downside I can think of is that the resulting file would probably break the find-source and detect-duplicate features.

alphamule

The magic of matrix transformations!

It should shock no one who ever took linear algebra that you can also do reflections (flips), or indeed anything reversible that operators like T (transpose) and multiplication by [[0 1][1 0]] would do to any other array (like, say, a bitmap).
Yeah, to get arbitrary angles you have to do that dirty cosine rotation, which is lossy. :( Basically, the blocks in a JPEG are preserved losslessly if the new pixel positions land exactly on the existing grid (which fails for anything other than multiples of 90 degrees or reflections). If they don't, you have to create a new grid.

errorist said:
Interesting. I would have assumed JPEG insists on aligning its blocks to the top left, making lossless rotation impossible unless both dimensions are divisible by 8.

Anyway, assuming someone was willing to start doing this to any tag-rotated JPEGs they find, would this be a viable use case for the replacement feature? The only downside I can think of is that the resulting file would probably break the find-source and detect-duplicate features.

There's actually a problem when your image has too many rows or columns and you have an edge. I love this explanation: https://www.betterjpeg.com/lossless-rotation.htm
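The edge problem can be summarised as a rule about which dimension has to be a whole number of blocks for each transform. The sketch below encodes that rule as the jpegtran documentation describes it, as far as I understand it: transpose has no restriction, and the other transforms need the width, the height, or both to be a multiple of the block (MCU) size. The function name is hypothetical, and the default 8x8 MCU is an assumption; with chroma subsampling the MCU is commonly 16 pixels in one or both directions.

```python
def is_perfectly_lossless(transform, width, height, mcu_w=8, mcu_h=8):
    """Return True if the transform leaves no partial edge blocks behind."""
    known = {"transpose", "flip_h", "flip_v",
             "rot90", "rot180", "rot270", "transverse"}
    if transform not in known:
        raise ValueError(f"unknown transform: {transform}")
    # Transforms that mirror the image left-right at some point need a
    # width that is a whole number of MCUs.
    needs_width = transform in {"flip_h", "rot270", "rot180", "transverse"}
    # Transforms that mirror it top-bottom at some point need the same
    # of the height.
    needs_height = transform in {"flip_v", "rot90", "rot180", "transverse"}
    if needs_width and width % mcu_w != 0:
        return False
    if needs_height and height % mcu_h != 0:
        return False
    return True


print(is_perfectly_lossless("transpose", 1001, 777))
print(is_perfectly_lossless("rot90", 1000, 777))
print(is_perfectly_lossless("rot180", 1000, 776))
```

So lossless rotation does not strictly require both dimensions to be divisible by 8. It depends on the transform, and when the check fails a tool has to either leave a strip of edge blocks untransformed or trim them.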

An interesting question is whether you could do color or brightness manipulations in a lossless way: the exact same frequency and phase information, but with the entire image's red swapped for green, for example. JPEG's lossy process was aimed at eliminating useless parts of the FFT map (yes, I know that JPEG uses DCT).
