Topic: A suggestion for when uploading

Posted under General

So I've noticed around the site that some images get flagged for deletion because they're reposts of another image. I had a suggestion that might stop this: when someone uploads, the site searches all existing images for the same web address (URL), and if it finds one, it shows a message like "Error: this file has already been uploaded" and stops the upload. I'm not sure if this can be done, I'm only putting it out there as a thought...

Updated by Rainbow Dash

thatoneclarinetist said:
So I've noticed around the site that some images get flagged for deletion because they're reposts of another image. I had a suggestion that might stop this: when someone uploads, the site searches all existing images for the same web address (URL), and if it finds one, it shows a message like "Error: this file has already been uploaded" and stops the upload. I'm not sure if this can be done, I'm only putting it out there as a thought...

It already does this if it's the exact same URL.

Updated by anonymous

The server runs an MD5 check on all uploads. If a match is found, it simply redirects to the existing post (and possibly adds any tags you put on the upload to that post).

Updated by anonymous

Moon_Moon said:
It already does this if it's the exact same URL.

If it did, then the problem wouldn't exist. I guess this could also apply to already-deleted images, so trolls or whoever can't KNOWINGLY upload deleted images.

Updated by anonymous

Char

Former Staff

thatoneclarinetist said:
If it did, then the problem wouldn't exist. I guess this could also apply to already-deleted images, so trolls or whoever can't KNOWINGLY upload deleted images.

As NotMeNotYou mentioned, the site performs an MD5 check when a user tries to upload a new image. This means that if an identical image file is found to have already been uploaded to the site, it won't upload the image. This is a crude form of duplicate checking, because if the two files are different in any way at all (different file types, sizes, compression settings, etc), then the MD5 hashes for the files won't match, and the site won't realize that an "identical" image already exists.
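To make the MD5 idea concrete, here is a minimal Python sketch of that kind of check. It is not the site's actual code; the `PostStore` class, its `upload` method, and the in-memory dictionary are all made up for illustration.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of the raw file bytes."""
    return hashlib.md5(data).hexdigest()


class PostStore:
    """Hypothetical in-memory stand-in for the site's post database."""

    def __init__(self):
        self._by_md5 = {}   # maps MD5 hex digest -> post id
        self._next_id = 1

    def upload(self, data: bytes):
        """Return (post_id, created); created is False for an exact duplicate."""
        digest = md5_of(data)
        existing = self._by_md5.get(digest)
        if existing is not None:
            # Byte-for-byte identical file: point at the existing post instead.
            return existing, False
        post_id = self._next_id
        self._next_id += 1
        self._by_md5[digest] = post_id
        return post_id, True


store = PostStore()
first = store.upload(b"example image bytes")
repeat = store.upload(b"example image bytes")        # same bytes, same hash
resaved = store.upload(b"example image bytes\x00")   # one extra byte, new hash
```

Here `repeat` resolves to the first post, while `resaved` is treated as a brand-new post even though a person might call it the same picture, which is exactly the limitation described above.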

The problem is that if we want to take it a step further so that the site recognizes an image by its contents rather than its MD5 hash, we have to set up and maintain image-matching software (think TinEye, IQDB, Google reverse image search, etc.), and this is not an easy task for us at the moment. The system also isn't going to be perfect, so we'd need some way of verifying the "matches" it finds when users try to upload new posts. Sketches and other artwork with very low detail tend to trip up systems like this a lot, and e621 has plenty of both.
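For a rough idea of what matching by contents can look like, here is a toy "average hash" in Python. This is only one simple technique and not necessarily what TinEye, IQDB, or any future e621 system uses. It assumes each image has already been shrunk to the same tiny grayscale grid (plain lists of 0-255 values here), and the function names and the `max_distance` threshold are assumptions for illustration.

```python
def average_hash(pixels):
    """Turn a small grayscale grid into a bit string packed into an int.

    Each bit is 1 if that pixel is brighter than the grid's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(pixels_a, pixels_b, max_distance=2):
    """Guess whether two grids show the same picture (threshold is arbitrary)."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


original = [[10, 200], [220, 30]]
recompressed = [[12, 198], [215, 33]]   # slightly different pixel values
different = [[200, 10], [30, 220]]      # bright and dark areas swapped

same_picture = looks_like_duplicate(original, recompressed)
other_picture = looks_like_duplicate(original, different)
```

Because the answer depends on a distance threshold instead of an exact match, the result is always a guess, which is why the matches would still need some kind of verification.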

We're still hopeful that we're going to eventually have a system like this in place, but hopefully this at least explains why it's not as simple as we'd all like for it to be.

Updated by anonymous

Also, I already explained this a day ago on the feature request thread.

Updated by anonymous
