Topic: Questions about replacing Twitter uploads in regular or "large" format with "orig" versions

Posted under General

There are many posts on e621 that have been uploaded exclusively to Twitter by the original artists, with no other public means of accessing them. Of these uploads, many do not have a higher-resolution variant available. This limits us to determining if the basic, "large", or "orig" source was used by manually checking for compression artifacts, cross-referencing file sizes, or using some kind of userscript.

After some digging, I was unable to find specific information on the Sites and Sources page, the expanded section for Twitter, or prior forum threads regarding a few questions I had about the upload-and-flag process. The questions I had were:

1) Should we replace standard and "large" versions with "orig" versions of Twitter uploads of the same resolution, even if there are little to no visual differences?

2) If 1 is desired, does the "orig" version still count as superior if the file size is smaller than the standard or "large" versions?

3) Is it worth the time and effort to bother with this kind of undertaking? Does the added load on the flag handling queue outweigh the minor benefits to archival?

An example of questions 1 and 2 in practice is post #1938751. The resolution is 990x765 and the file size is 83.8 KB. This file size matches the basic and "large" versions at the Twitter source, and the images are visually identical. However, the "orig" version is 72.4 KB and has almost imperceptibly less artifacting when inspected closely.

In the above example, which actions would be ideal? Should the current version be left alone, or should it be flagged and replaced with the "orig" JPG since that is technically most similar to the original? Again, I ask due to the frequency of this type of situation. Thank you for your time.

Updated by Mairo

https://e621.net/wiki/show/howto:index
https://e621.net/wiki/show/howto:sites_and_sources_-_n_to_z#twitter

:orig is the best suffix to add to a Twitter image URL; it returns the image at "original" upload quality. I put that in quotes because Twitter ruins almost everything with compression artifacts. File size does not equal quality. It's not a defining factor in post quality.
Though the difference between large and orig might be invisible to the eye, there can still be compression artifacts that you just can't see, which from what I've heard can be found by comparing the two in tools like GIMP.

If you ever see a Twitter post sourced with :large and you can't see any difference, try uploading the :orig version directly anyway. If the upload gets MD5-matched and blocked, you'll know for sure that there was no difference; if it gets through, then you've posted a better version.
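For anyone who would rather check before uploading, the same idea can be sketched locally in Python. This is only an illustration, not anything the site provides: the function names are made up, the example URL is hypothetical, and it assumes the image URL ends in one of the size suffixes mentioned in this thread (:medium, :large, :orig). Downloading the two files is left out; the comparison works on bytes you have already saved.

```python
import hashlib
import re

# Assumption: the URL may end in one of the size suffixes named in this thread.
_SIZE_SUFFIX = re.compile(r":(?:medium|large|orig)$")


def with_variant(url: str, variant: str = "orig") -> str:
    """Return the URL with any existing size suffix swapped for `variant`."""
    base = _SIZE_SUFFIX.sub("", url)
    return f"{base}:{variant}"


def md5_hex(data: bytes) -> str:
    """MD5 hex digest of raw file bytes."""
    return hashlib.md5(data).hexdigest()


def is_same_file(a: bytes, b: bytes) -> bool:
    """True when both downloads have the same MD5, i.e. no difference to fix."""
    return md5_hex(a) == md5_hex(b)


# Hypothetical URL, for illustration only.
orig_url = with_variant("https://pbs.twimg.com/media/EXAMPLE.jpg:large")
```

If is_same_file reports True for the :large and :orig downloads, an upload of the :orig file would presumably be MD5-matched as described above.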

Updated by anonymous

Strongbird said:
1) Should we replace standard and "large" versions with "orig" versions of Twitter uploads of the same resolution, even if there are little to no visual differences?

2) If 1 is desired, does the "orig" version still count as superior if the file size is smaller than the standard or "large" versions?

3) Is it worth the time and effort to bother with this kind of undertaking? Does the added load on the flag handling queue outweigh the minor benefits to archival?

1. Filesize is irrelevant information. Also this is where things get complicated.

If the orig and other versions are all PNG, that usually means all versions are visually the same, so if the resolution is the same, the post should not be replaced.

More commonly, however, the filetype is JPG. If the resolution is below 2048px in dimension, large and orig are usually (not always) exactly identical, which can be confirmed by a matching MD5 hash. If that's the case, then changing the post's source to the orig URL is all that's needed. However, if the resolution is above 2048px, or if it is below 1200px and the sample size is medium instead of large, then there is a slight difference and the post should be replaced with the orig version.
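The rule of thumb above could be written down roughly like this. It is a sketch of the "usually" cases only, not site policy: the function name and return strings are made up, "in dimension" is assumed to mean the longest side, and anything the rule doesn't cover falls back to comparing the images by eye.

```python
def suggest_action(filetype: str, width: int, height: int,
                   same_md5: bool, sample: str = "large") -> str:
    """Rough encoding of the rule of thumb above (hypothetical helper)."""
    if same_md5:
        # Files are identical, so only the source URL needs to point at orig.
        return "update source to orig"
    if filetype.lower() == "png":
        # PNG variants at the same resolution are usually visually the same.
        return "keep post"
    longest = max(width, height)  # assumption: "dimension" = longest side
    if longest > 2048 or (longest < 1200 and sample == "medium"):
        return "replace with orig"
    # Not covered by the rule of thumb: check visually.
    return "compare visually"
```

For example, suggest_action("jpg", 4096, 2000, same_md5=False) returns "replace with orig", while a mid-sized JPG with a differing MD5 returns "compare visually".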

2. Refer to answer 1. In many cases with JPG, a resaved version can actually be bloated to the point that its filesize is much higher than the original copy's while also being of inferior quality, which is why in many cases a higher filesize is an indication of an inferior version.

3. I would say it's worth it. We are still a resource where people find their content, and as time goes on we are still an archive where copies of vanished content can be obtained. As such, even if the difference is ever so slight, I would deem it worth it to minimize the effect of generation loss. If there is a sudden spike in flags, I can try to find some time to deal with them. Alternatively, you could poke someone from staff to get janitor status so you can delete posts yourself. In this case the status could be given primarily for post replacement rather than approvals, as you seem to have general knowledge of what the fuck you are doing (although this is just my idea, so this isn't a promise or anything). This would effectively free up janitors' time to handle other flags and approvals instead of mindlessly clicking delete on flags from you.

Also, in general, I would like there to be an incentive for users to do shit correctly themselves, instead of assuming that what they are doing is the correct way of handling things, because there will already be cases where someone needs to fix their mistakes. This is why the wiki exists, and this is why I wrote on the wiki that orig is always preferred: to keep shit from happening and lessen the work of others. Even if large and orig happen to be identical in some cases, those posts still show up when trying to find samples with a source: search, and it will be Russian roulette whether the user happened to upload the file matching orig.

Strongbird said:
An example of questions 1 and 2 in practice is post #1938751. The resolution is 990x765 and the file size is 83.8 KB. This file size matches the basic and "large" versions at the Twitter source, and the images are visually identical. However, the "orig" version is 72.4 KB and has almost imperceptibly less artifacting when inspected closely.

In the above example, which actions would be ideal? Should the current version be left alone, or should it be flagged and replaced with the "orig" JPG since that is technically most similar to the original? Again, I ask due to the frequency of this type of situation. Thank you for your time.

The difference is actually pretty huge. Here's a boolean-type image comparison; pure black areas are identical between the images: https://puu.sh/EgFpn/58c013ccb5.png
If you don't know how to use something like Photoshop or GIMP for visual image comparison, idem's tool has a simpler option with just URL fields and a compare button, and it gives a visually better comparison between two images: forum #270739
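For the curious, a boolean comparison like the one linked above boils down to something like the following sketch. It is not how the linked image or idem's tool was actually made; it assumes both images are already decoded into same-sized grids of RGB tuples (an imaging library would normally do that part), and the names are made up.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def difference_mask(a: Image, b: Image) -> List[List[int]]:
    """Return 0 (black) where pixels are identical and 255 where they differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[0 if pa == pb else 255 for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def differing_fraction(mask: List[List[int]]) -> float:
    """Share of pixels that differ, between 0.0 and 1.0."""
    total = sum(len(row) for row in mask)
    changed = sum(1 for row in mask for value in row if value)
    return changed / total if total else 0.0
```

A mask that is all zeros means the two decoded images are pixel-identical even if the files themselves differ; any non-zero fraction means the variants really do differ visually.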

Also, this is one of the cases where it's clear how resaving a JPG file not only worsens the quality but also bloats the filesize, as the compressions stack and it takes more information to try to maintain the quality of already compressed material.

In that case, I would upload the orig version and get rid of that post as inferior.

Also, as a reminder, e621 goes by visual quality with posts, rather than MD5 hashes or whether the difference is seen by users. If the copy is visually closer to the artist's original copy, it's superior.
e621:image_quality

Updated by anonymous
