Topic: Using Wayback Machine/Internet Archive as an image source?

Posted under General

I know that images may be removed if the artist wants them removed and such, but is there anything against taking an image from the past that was otherwise inaccessible?

This website in particular: https://archive.org/

As far as I can tell, they aren't images the artist necessarily wanted gone, but were otherwise lost to time, such as the mass Tumblr Apocalypse.

Updated by Mairo

Still put the actual proper source in, even if it is 404.
It is up to users themselves to use whatever archiving service they desire with the URL that is provided in the source field.

I see this as pretty similar to stuff like upscales: they have their purpose, but it's not our purpose. Our purpose is to provide the original content, rather than an already processed version, so that users can then do whatever they want with it on their own computers.

Also, as a side note, Tumblr doesn't delete images, so even if the blog post or even the blog itself is gone, you can still access the images as long as you have a direct URL to them. Reblogs also work, so if you do manage to get a reblog link through an archival service, it should work.

Updated by anonymous

Pup

Privileged

Mairo said:
Still put the actual proper source in, even if it is 404.
It is up to users themselves to use whatever archiving service they desire with the URL that is provided in the source field.

Couldn't you include the actual image as the source, like you said, then put the https://archive.org link as a second source, so people can still see/find it?

Updated by anonymous

Pupslut said:
Couldn't you include the actual image as the source, like you said, then put the https://archive.org link as a second source, so people can still see/find it?

That's pretty much how I'd do it. The Wayback Machine is a source even if not a primary one. I think I've used it as a source a few times when uploading otherwise lost pictures to e621.

Updated by anonymous

Mairo said:
Still put the actual proper source in, even if it is 404.
It is up to users themselves to use whatever archiving service they desire with the URL that is provided in the source field.

I see this as pretty similar to stuff like upscales: they have their purpose, but it's not our purpose. Our purpose is to provide the original content, rather than an already processed version, so that users can then do whatever they want with it on their own computers.

Also, as a side note, Tumblr doesn't delete images, so even if the blog post or even the blog itself is gone, you can still access the images as long as you have a direct URL to them. Reblogs also work, so if you do manage to get a reblog link through an archival service, it should work.

I'd argue that the archive is technically the only way to get the 'original content'. Plus, simply adding a dead link makes no sense to me.

Updated by anonymous

PheagleAdler said:
I'd argue that the archive is technically the only way to get the 'original content'. Plus, simply adding a dead link makes no sense to me.

At least in the Wayback Machine's case, the original URL is still retained in the archived version. However, there are also archive websites that do not do this, where you can use the original URL to search whether it's archived or not.
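To illustrate the point about the original URL being retained, here is a minimal sketch that recovers it from a Wayback Machine capture link. It assumes the common capture format https://web.archive.org/web/<timestamp>/<original URL>, optionally with a two-letter modifier such as im_ after the timestamp. The function name original_url and the example URLs are made up for illustration.

```python
import re

# Assumed shape of a Wayback Machine capture URL:
#   https://web.archive.org/web/<timestamp><optional modifier>/<original URL>
# The timestamp is up to 14 digits; the modifier (e.g. "im_") is optional.
_WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{1,14})([a-z]{2}_)?/(.+)$"
)


def original_url(source: str) -> str:
    """Return the original URL embedded in a Wayback capture link.

    Links that do not match the assumed pattern are returned unchanged.
    """
    match = _WAYBACK_RE.match(source)
    return match.group(3) if match else source


if __name__ == "__main__":
    # Hypothetical example link, not a real capture.
    print(original_url(
        "https://web.archive.org/web/20181203000000/"
        "https://example.tumblr.com/post/123"
    ))
```

A link that is not a Wayback capture passes through untouched, so the function could be applied to a whole source list without checking each entry first.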

Regardless, my point still stands: the original URL, even if it 404s, is still the most valuable form of source. This is the URL you pretty much need to use with archiving websites.
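As a rough sketch of why the original URL is the useful part: given only that URL, anyone can build Wayback Machine lookup links themselves. This assumes the usual Wayback address patterns (/web/<url> for the most recent capture and /web/*/<url> for the capture listing). The helper name wayback_lookup_urls and the example URL are hypothetical.

```python
def wayback_lookup_urls(original: str) -> dict:
    """Build Wayback Machine lookup links from an original source URL.

    Assumes the usual Wayback address patterns; no network request is
    made here, so the links may or may not lead to an actual capture.
    """
    base = "https://web.archive.org/web/"
    return {
        # Redirects to the most recent capture, if one exists.
        "latest": base + original,
        # Lists all captures of the URL.
        "all_captures": base + "*/" + original,
    }


if __name__ == "__main__":
    # Hypothetical dead source URL.
    for name, link in wayback_lookup_urls(
        "https://example.tumblr.com/post/123"
    ).items():
        print(name, link)
```

The same original URL can be pasted into other archiving services as well, which is not possible the other way around if only the archive link was stored.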

Also, in Tumblr's case, what if Pornhub actually does buy the whole site and releases all the explicit blogs all of a sudden? It would be a pretty massive undertaking to go through all the posts with Wayback Machine links and remove those. I can script tags, but I cannot script sources, and those might also need a bit more manual oversight.

This all just feels as pointless to me as linking to a Google search result as a source: the search terms are what actually matter, and at that point you can even use DuckDuckGo if you want to.

Also, we did actually have this discussion on Discord the other day: a source's job is to point out where the content is from; it's not there to make users' lives more convenient. The Wayback Machine is still a third-party-maintained copy of the original source, at which point even e621 itself is as good a source.

Updated by anonymous
