Topic: Need help identifying software that is causing this file embedding on these posts

Posted under General

post #1515713 post #1490946 post #1357160
This is more of a heads-up than anything. Basically, whatever software these artists are using is embedding an mp4, an alternate version of the picture, and an xml file into the picture, always with the same names and locations: formats\living\living.mp4, living.jpg. You should be able to open these posts with 7zip and see for yourself. I'm convinced it's whatever software they are using to draw, as the videos are only a few seconds long and show brush strokes.

I found one in the past and flagged it for file embedding, but it wasn't deemed malicious and was left alone. Lately I've found a few more and will probably continue to. What's most concerning is that the first one contains an uncensored version of the picture, and with it being from Pixiv and probably Japan, there might be some legal ramifications for the artist. I don't know how feasible it would be to reach out, find out what software they are all using, and warn them about it. The middle post seems to be from a member here, who might be reached more efficiently by staff.

Also, if the posts are still OK to be on the site (and if/when I find more), should there be some kind of meta tag like embedded_file, or should I just add a note to the description? I'm not quite sure how it's being embedded, so I don't know if the site's uploader could detect it. There is appended data starting from where the picture data ends, but that can't be opened with 7zip alone like I normally can in these situations. If we can find out what software is causing this, we could at the very least try to educate artists who use it so that something actually major doesn't happen because of it (theoretical example: an uncensored Patreon version inside a censored one).

Updated by alphamule

These pics seem to be the same deal as live photos, but perhaps broken. That's all I got.

Updated by anonymous

If it were just the censored/uncensored images, I'd think the artist was doing it on purpose: It's a way to officially censor your images, while sharing the uncensored one. It's possible that they are doing it that way, and simply do it with everything so that if they get caught they can claim they didn't know.

Updated by anonymous

Ruikuli said:
These pics seem to be the same deal as live photos, but perhaps broken. That's all I got.

Doing some more research, I think I figured it out after reading through this MS Answers thread: it's a part of Windows 10 called living photos. I'm not running Windows 10, so I can't test these posts with the app, but if someone could test them and they do work, then maybe just add the tags living photo and maybe animated?

Updated by anonymous

pc-king said:
post #1515713 post #1490946 post #1357160
This is more of a heads-up than anything. Basically, whatever software these artists are using is embedding an mp4, an alternate version of the picture, and an xml file into the picture, always with the same names and locations: formats\living\living.mp4, living.jpg. You should be able to open these posts with 7zip and see for yourself. I'm convinced it's whatever software they are using to draw, as the videos are only a few seconds long and show brush strokes.

I found one in the past and flagged it for file embedding, but it wasn't deemed malicious and was left alone. Lately I've found a few more and will probably continue to. What's most concerning is that the first one contains an uncensored version of the picture, and with it being from Pixiv and probably Japan, there might be some legal ramifications for the artist. I don't know how feasible it would be to reach out, find out what software they are all using, and warn them about it. The middle post seems to be from a member here, who might be reached more efficiently by staff.

Also, if the posts are still OK to be on the site (and if/when I find more), should there be some kind of meta tag like embedded_file, or should I just add a note to the description? I'm not quite sure how it's being embedded, so I don't know if the site's uploader could detect it. There is appended data starting from where the picture data ends, but that can't be opened with 7zip alone like I normally can in these situations. If we can find out what software is causing this, we could at the very least try to educate artists who use it so that something actually major doesn't happen because of it (theoretical example: an uncensored Patreon version inside a censored one).

The first picture was made by an artist who isn't Japanese, yet he/she posted it on Pixiv, a Japanese site. Clearly, I have no words.

Updated by anonymous

cerberusmod_3 said:
The first picture was made by an artist who isn't Japanese, yet he/she posted it on Pixiv, a Japanese site. Clearly, I have no words.

???

Updated by anonymous

cerberusmod_3 said:
The first picture was made by an artist who isn't Japanese, yet he/she posted it on Pixiv, a Japanese site. Clearly, I have no words.

That is entirely off topic. That is not the point.

And you don't need to be Japanese to use a Japanese site.

Pixiv officially supports Japanese, English, Korean, Simplified Chinese and Traditional Chinese (scroll down to the bottom right of any Pixiv page).

We don't act confused when someone who is German uses an English website, or when Chinese people use DeviantArt. People can use whatever websites they like.

So, shhh. That is not the point of this topic. The point of this topic is embedded files.

Updated by anonymous

SnowWolf said:
That is entirely off topic. That is not the point.

And you don't need to be Japanese to use a Japanese site.

Pixiv officially supports Japanese, English, Korean, Simplified Chinese and Traditional Chinese (scroll down to the bottom right of any Pixiv page).

We don't act confused when someone who is German uses an English website, or when Chinese people use DeviantArt. People can use whatever websites they like.

So, shhh. That is not the point of this topic. The point of this topic is embedded files.

Well, it's just that only Japanese artists should have to censor genitals, because of the law in their country.

Updated by anonymous

cerberusmod_3 said:
Well, it's just that only Japanese artists should have to censor genitals, because of the law in their country.

Censorship

... As I've said before, some people LIKE censorship marks, just like some people like underwear that lets the genitals be seen.

According to Wikipedia, in South Korea: "Pornographic websites, books, writings, films, magazines, photographs or other materials of a pornographic nature are illegal in South Korea, although the law is not regularly enforced. Distribution of pornography can result in a fine or a two-year prison sentence. Since 2009, pornographic websites have been blocked by the South Korean government."

Also, this thread is not about censorship, Korea, Japan, laws or politics. It's about the embedded files within the posts in question.

If you would like to continue discussing this, I would suggest leaving a comment on the picture, instead of in a forum thread about a completely different thing.

Updated by anonymous

cerberusmod_3 said:
Well, it's just that only Japanese artists should have to censor genitals, because of the law in their country.

You do realize that Pixiv's site rules state that uncensored genitalia are not allowed? So regardless of where the user is from, they need to follow Pixiv's rules if they want to keep posting there.

Seriously, what is with you and censoring? You keep bringing it up in every second thread you post in, regardless of what the thread is about.

Updated by anonymous

Yeah, that's definitely some specific program embedding this on purpose. I would not be opposed to simply having a new tag for this. embedded_file sounds good to me, so just use that. A wiki page can be written about this as well, because we do not want to have posts with embedded data on them like this, for hopefully obvious reasons, but if it's software inserting it during export, it's not exactly malicious or harmful.

Furrin_Gok said:
If it were just the censored/uncensored images, I'd think the artist was doing it on purpose: It's a way to officially censor your images, while sharing the uncensored one. It's possible that they are doing it that way, and simply do it with everything so that if they get caught they can claim they didn't know.

In this case the resolution and quality differences are immense, which is partially the reason why nobody even thought there would be extra data in the post: a much lower quality copy does not raise the filesize that much.

Updated by anonymous

Mairo said:
Yeah, that's definitely some specific program embedding this on purpose. I would not be opposed to simply having a new tag for this. embedded_file sounds good to me, so just use that. A wiki page can be written about this as well, because we do not want to have posts with embedded data on them like this, for hopefully obvious reasons, but if it's software inserting it during export, it's not exactly malicious or harmful.

After creating a Windows 10 VM and trying to get the "living images" to work without success, I did some more digging. It looks like it was a feature of Windows 8.1 and Windows Mobile but is no longer supported in the photo app. It looks like there was an update that might work on Win 10, but it requires some conversion if the photo was created in Windows 8.1 or an older version of Windows Mobile. Some other files that were created internally are also needed for this to work as far as I can tell; I might try to get a copy of Windows 8.1 working to find out later. Someone mentioned in a 2015 post that he was working on a program to do it in C#, but that was the last thing he posted. Based on the top reply here and here, I can almost guarantee that these are "living images", probably made in the Windows 8.1 sketch pad or whatever they are calling it now, as it's described almost identically. I suggest maybe a wiki page on living images and calling the tag living_images as well, since that's what it is, and holding off on the animation tag because Windows doesn't seem to support it anymore, at least not natively.

hsauq said:
I'm genuinely curious: How did you discover that these images had embedded files? What made you check?

I do active scans with some custom scripts, looking for anything embedded in any picture I save off the internet. It's kinda turned into a bit of a weird hobby/obsession at this point. It started back when I used to frequent the /g/ board on 4chan and found out that a lot of stuff was secretly embedded into pictures. Luckily I didn't find anything illegal, just some weird stuff and gore, but some people did, and I've been paranoid ever since. I've actually found some interesting things here in the past, like people hiding DNP paid stuff in pictures and stuff like this: https://e621.net/ticket/show/47183. It's also kind of neat to see the easter eggs some artists put at the end of the file, like their name, or sometimes poems, music lyrics, etc.

Updated by anonymous

Welp, I'll start working on a program. Not sure about 7Z, but WinRAR has a quick command-line process for this. Should be quick enough, as this seems simple.

Updated by anonymous

Imanton1 said:
Welp, I'll start working on a program. Not sure about 7Z, but WinRAR has a quick command-line process for this. Should be quick enough, as this seems simple.

7zip has a command line too. Here is part of one of the scripts I use; you can just save this as scanner.bat from Notepad.

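@echo off
rem Output folder on the desktop. Change the 7z.exe path below if 7-Zip is installed elsewhere.
set "OUTDIR=%USERPROFILE%\Desktop\livephoto"
if not exist "%OUTDIR%" mkdir "%OUTDIR%"
rem Try to extract every image under the current folder as if it were an archive, and log the output.
for /r %%x in (*.png *.jpg *.jpeg *.gif) do "C:\Program Files\7-Zip\7z.exe" x "%%x" -o"%OUTDIR%" -r -aou >> 7zip_log.txt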

This will create a folder on the desktop and place any findings in it. If you installed 7zip somewhere other than the default location, you will need to change that path. Run the bat script in the folder with all the pictures you want to search. I recommend installing 7zip 9.20, as you can just search the log for "extract" and it will show you which pictures it succeeded on; the newer versions also log every failed extraction attempt, which makes it more difficult to search the log.

Here is the Python script I use as well to look for any data at the end, after the picture data stops, if anyone is really that interested. You'll need to install some kind of hex editor like HxD just to see what kind of data you are looking at, and you might want to familiarize yourself with file headers so you know whether you are actually looking at something interesting or just corrupted data: http://pastebin.com/raw.php?i=L3V9HxFh
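For anyone who doesn't want to open the pastebin link, the general idea of "look for data after the picture data stops" can be sketched roughly as below. This is not the linked script, just a minimal illustration with made-up function names (`png_end`, `jpeg_end`, `trailing_data`). It walks the PNG chunks up to IEND, or the JPEG segments up to the EOI marker, and returns whatever bytes come after.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"
# Bytes that may follow 0xFF inside JPEG scan data without ending the scan:
# stuffed zero, fill byte, and the restart markers RST0..RST7.
NOT_A_MARKER = {0x00, 0xFF} | set(range(0xD0, 0xD8))


def png_end(data: bytes) -> int:
    """Return the offset just past the IEND chunk of a PNG."""
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        pos += 8 + length + 4  # chunk header + payload + CRC
        if ctype == b"IEND":
            if pos > len(data):
                raise ValueError("truncated IEND chunk")
            return pos
    raise ValueError("no IEND chunk found")


def jpeg_end(data: bytes) -> int:
    """Return the offset just past the EOI marker of a JPEG."""
    pos, n = 2, len(data)
    while pos < n:
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        while pos < n and data[pos] == 0xFF:  # skip fill bytes
            pos += 1
        if pos >= n:
            break
        marker = data[pos]
        pos += 1
        if marker == 0xD9:  # EOI: end of image
            return pos
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            continue  # markers that carry no length field
        pos += int.from_bytes(data[pos:pos + 2], "big")
        if marker == 0xDA:  # SOS: compressed scan data follows the header
            while pos + 1 < n and not (
                data[pos] == 0xFF and data[pos + 1] not in NOT_A_MARKER
            ):
                pos += 1
    raise ValueError("no EOI marker found")


def trailing_data(data: bytes) -> bytes:
    """Return the bytes that follow the end of the image, if any."""
    if data.startswith(PNG_SIG):
        return data[png_end(data):]
    if data.startswith(b"\xff\xd8"):
        return data[jpeg_end(data):]
    raise ValueError("not a PNG or JPEG")
```

A non-empty result only means that something follows the image. You still have to look at it in a hex editor to tell whether it's an archive, a second picture, or just padding.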

Updated by anonymous

Ruikuli said:
You do realize that Pixiv's site rules state that uncensored genitalia are not allowed? So regardless of where the user is from, they need to follow Pixiv's rules if they want to keep posting there.

Seriously, what is with you and censoring? You keep bringing it up in every second thread you post in, regardless of what the thread is about.

Well, things get complicated sometimes when it comes to posting porn.

I still can't get over the fact that people on the Internet still get pissed whenever censored stuff, particularly stuff made in Japan, is around.

Updated by anonymous

I'm kinda surprised services don't already strip these things on upload.

Updated by anonymous

pc-king said:
I recommend installing 7zip 9.20, as you can just search the log for "extract" and it will show you which pictures it succeeded on; the newer versions also log every failed extraction attempt, which makes it more difficult to search the log. Here is the Python script I use as well to look for any data at the end, after the picture data stops, if anyone is really that interested. You'll need to install some kind of hex editor like HxD just to see what kind of data you are looking at, and you might want to familiarize yourself with file headers so you know whether you are actually looking at something interesting or just corrupted data: http://pastebin.com/raw.php?i=L3V9HxFh

Why do you need to scrape the STDOUT? The error codes are faster and probably more reliable. Besides that, 7Z and WR have identical timing (over 6600 files) and the same command line arguments, so potato potato to whatever you use.

As for the headers, that's pretty much what I do in my spare time, but you got the fun of writing the python code first, so hats off to you.

Edit:
Also, yes, I'm surprised since it only takes a quarter-second server-side on upload to scan for these, at least for PNG/JPG images.

Updated by anonymous

Imanton1 said:

As for the headers, that's pretty much what I do in my spare time, but you got the fun of writing the python code first, so hats off to you.

Sorry, I can't take credit for the Python script. Someone else wrote it back in 2013-14 or so, as part of a joint effort to find anything that was embedded in people's archives from different chan sites.

Updated by anonymous

banhday said:
I'm kinda surprised services don't already strip these things on upload.

It's not that easy to know what's safe to strip. Even assuming you had a perfectly accurate system, you still would have to fully parse large chunks of the file in order to know what was bogus. That would include testing binary attachments that are normally legitimate valuable parts of the file (eg. ICC profile) and tracking the length of the file (one common exploit is simply appending a second file to the end of an image file)

TL;DR: hard to do correctly, slow, and usually wouldn't be worth the time necessary to implement it. Also a moving target -- there are always more exploits to be found.

The most reasonable scenario IMO would be simply detecting disproportionate filesize (Image reports X*Y pixels, filesize is significantly bigger than the estimated worst-case encoded size of that amount of pixel data) and flagging the image for manual embed checking.
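A rough sketch of that filesize check for PNG, to show how little it takes. The function names and the `slack` and `overhead` values are arbitrary assumptions on my part, not tuned numbers. The worst case is taken as the uncompressed pixel data implied by the IHDR chunk (one filter byte per row included), plus some headroom for metadata.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"
# Samples per pixel for each PNG colour type, per the PNG specification.
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_raw_size(data: bytes) -> int:
    """Uncompressed pixel data size implied by the IHDR chunk, in bytes."""
    if not data.startswith(PNG_SIG) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    width, height, depth, colour = struct.unpack(">IIBB", data[16:26])
    bits_per_row = width * CHANNELS[colour] * depth
    return height * ((bits_per_row + 7) // 8 + 1)  # +1 filter byte per row


def is_oversized(data: bytes, slack: float = 1.1, overhead: int = 64 * 1024) -> bool:
    """Flag a PNG whose file size exceeds its estimated worst-case size."""
    return len(data) > png_raw_size(data) * slack + overhead
```

The obvious weakness is that compression leaves a lot of headroom, so a small payload (like a much lower quality copy of the picture) fits under the limit and is never flagged. That's why I'd only use it to queue images for manual checking.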

Updated by anonymous

savageorange said:
It's not that easy to know what's safe to strip. Even assuming you had a perfectly accurate system, you still would have to fully parse large chunks of the file in order to know what was bogus. That would include testing binary attachments that are normally legitimate valuable parts of the file (eg. ICC profile) and tracking the length of the file (one common exploit is simply appending a second file to the end of an image file)

TL;DR: hard to do correctly, slow, and usually wouldn't be worth the time necessary to implement it. Also a moving target -- there are always more exploits to be found.

The most reasonable scenario IMO would be simply detecting disproportionate filesize (Image reports X*Y pixels, filesize is significantly bigger than the estimated worst-case encoded size of that amount of pixel data) and flagging the image for manual embed checking.

Actually, in practice, if the site transcodes/downscales the image, this is automatic. Of course, it's then a resized sample, which is the bane of HoarDers (owners of lots of big Hard Drives) everywhere. :P

The Freenet project and Wikileaks were dealing with stuff like metadata in JPEGs for mostly the same reasons. So if you're looking for an EXIF stripper or the like, that's where to go to ask. The way you'd write a stripper is to extract the image data (not decompressed) and recreate the file, apparently. JPEG compresses things in blocks, and this is why we have lossless JPEG rotators. See https://www.betterjpeg.com/lossless-rotation.htm for an example.
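Here's roughly what "extract the image data and recreate the file" could look like for JPEG, as a sketch under my own assumptions and not how any of the projects above do it. It copies the file segment by segment without decoding anything, drops the comment and most APPn segments, and stops at the EOI marker so appended data is left behind. Which segments to drop is a judgment call: this version keeps APP0 (JFIF) and APP14 (Adobe) because decoders may rely on them, and it throws away APP2, which is where ICC profiles live.

```python
# Segments to drop: COM plus APP1..APP13 and APP15 (an assumption, adjust to taste).
DROP = {0xFE} | (set(range(0xE1, 0xF0)) - {0xEE})
# Bytes that may follow 0xFF inside scan data without ending the scan.
NOT_A_MARKER = {0x00, 0xFF} | set(range(0xD0, 0xD8))


def strip_jpeg(data: bytes) -> bytes:
    """Rebuild a JPEG from its segments, without metadata or trailing data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG")
    out = bytearray(b"\xff\xd8")
    pos, n = 2, len(data)
    while pos < n:
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        while pos < n and data[pos] == 0xFF:  # skip fill bytes
            pos += 1
        if pos >= n:
            break
        marker = data[pos]
        pos += 1
        if marker == 0xD9:  # EOI: write it and ignore everything after it
            out += b"\xff\xd9"
            return bytes(out)
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            out += bytes([0xFF, marker])  # markers that carry no length field
            continue
        end = pos + int.from_bytes(data[pos:pos + 2], "big")
        if marker == 0xDA:  # SOS: include the compressed scan data as-is
            while end + 1 < n and not (
                data[end] == 0xFF and data[end + 1] not in NOT_A_MARKER
            ):
                end += 1
        if marker not in DROP:
            out += bytes([0xFF, marker]) + data[pos:end]
        pos = end
    raise ValueError("no EOI marker found")
```

Since the compressed blocks are copied byte for byte, the picture itself is untouched. Only the container around it is rewritten.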

This doesn't protect from the "yellow dot" style of hiding information inside an image (i.e. watermarks intrinsic to the image itself, visible to human eyes or not). There are also ways to hide data that even resizing an image won't quickly remove (frequency or phase domains, I guess?). But this is veering off from the _appended_ data scenario, where it's done outside of the image data level in the file and done to non-image data instead.

One lazy solution is to search for archive/EXE/video/etc. signatures that don't belong in a given file. It sometimes fails (often, for short signatures) because innocent files contain the magic bytes by chance. If you tried this approach, you'd have to test that the data following the signature really is that type of data, and this is where it gets messy, fast. This doesn't sound like the way to go! XD
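For completeness, the lazy version looks something like the sketch below. The signature list is a small sample I picked for illustration, and `find_signatures` is a made-up name. Every hit is only a candidate offset, and nothing here checks that a real archive or video actually starts there.

```python
# A few well-known magic byte sequences (a sample, not a complete list).
SIGNATURES = {
    "zip": b"PK\x03\x04",
    "rar": b"Rar!\x1a\x07",
    "7z": b"7z\xbc\xaf\x27\x1c",
    "png": b"\x89PNG\r\n\x1a\n",
    "mp4": b"ftyp",  # sits 4 bytes into the file, after the box size
}


def find_signatures(data: bytes, start: int = 1):
    """Return sorted (offset, name) pairs for every signature found.

    Searching from offset 1 skips the file's own signature at offset 0.
    """
    hits = []
    for name, magic in SIGNATURES.items():
        pos = data.find(magic, start)
        while pos != -1:
            hits.append((pos, name))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

The four-byte entries are exactly the kind of short sequence that turns up by accident in compressed image data, which is the false positive problem described above.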

So, for JPEG files, if you're willing to throw away non-image data entirely, it works, but then you lose some nice metadata. For PNG and so on, you'd have to write a per-case method. Luckily, PNG is lossless, and thus full-trip-capable but it's still going to be quite slow if you have a lot of images per minute needing to be processed.
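For PNG there is also a cheaper middle ground than a full decode and re-encode: filter the file chunk by chunk. The sketch below is that swapped-in technique, not the round trip described above, and the `KEEP` list is my own guess at which chunks are worth keeping. It drops text and unknown chunks and anything after IEND, but it would not notice data hidden inside the IDAT stream itself.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"
# Chunks to keep: the critical ones plus a few that affect rendering (an assumption).
KEEP = {b"IHDR", b"PLTE", b"IDAT", b"IEND", b"tRNS", b"gAMA", b"sRGB", b"iCCP", b"pHYs"}


def strip_png(data: bytes) -> bytes:
    """Copy only the chunks in KEEP and stop at IEND."""
    if not data.startswith(PNG_SIG):
        raise ValueError("not a PNG")
    out = bytearray(PNG_SIG)
    pos = len(PNG_SIG)
    while pos + 12 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length  # length field + type + payload + CRC
        if end > len(data):
            raise ValueError("truncated chunk")
        if ctype in KEEP:
            out += data[pos:end]  # copied verbatim, CRC included
        pos = end
        if ctype == b"IEND":
            return bytes(out)
    raise ValueError("no IEND chunk found")
```

This is a single pass over the file with no decompression, so it should be much faster than re-encoding, at the cost of trusting whatever is inside the kept chunks.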

Updated by anonymous
