Topic: e621 in graphs

Posted under Art Talk

These graphs have been generated based on posts' score rather than favcount.

Feral vs anthro: http://i.imgur.com/d7yoDAG.png
Male vs female: https://i.imgur.com/QOFKKUw.png
Canine penis vs horsecock vs humanoid: http://i.imgur.com/Qb7uVeX.png
Pokémon with the most pictures: http://i.imgur.com/YS7RFru.png
Pokémon with the fewest pictures: http://i.imgur.com/2x91oJF.png
Pokémon with the highest explicit/total ratio: http://i.imgur.com/JtDaijy.png
Posts per second: http://i.imgur.com/CsJvGaV.png

Note "none" in all means that neither of the tags appear

Updated

Lizardite said:
Note "none" in all means that neither of the tags appear

So in the "order:score" search I looked for "canine_penis", "horsecock" and "humanoid_penis". Posts marked as "none" simply mean that none of those penis tags appeared on that page, not that there were penises that weren't tagged.
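
In shell terms it was something along these lines (a simplified sketch, not the exact script I used; the page count, the per-post bucketing and the assumption that each <post .../> element comes back on a single line are all just illustrative):

#!/bin/bash
# Walk the top-scored posts and bucket each one by the first matching tag.
PAGE="http://e621.net"

for P in $(seq 1 10); do
	wget -q -O- "$PAGE/post/index.xml?tags=order:score&page=$P&limit=100"
done | grep -o "<post [^>]*>" | while read -r POST; do
	case "$POST" in
		*canine_penis*)   echo "canine_penis" ;;
		*horsecock*)      echo "horsecock" ;;
		*humanoid_penis*) echo "humanoid_penis" ;;
		*)                echo "none" ;;
	esac
done | sort | uniq -c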

Updated by anonymous

RustyTorpedo said:
i like all penises

I like humanoid.

Updated by anonymous

Well, I cleared a bit more than one page of results and wound up fixing a number of other tags as well.

Makes me wonder if some people fail to realize that twys includes tagging male/female/intersex as well as genitals when visible. Of course, laziness probably plays a part as well.

Updated by anonymous

So if you draw male x male content here, you're basically ignored.

Updated by anonymous

Thank you for this. These stats please me greatly.

How are you converting the data?

Updated by anonymous

CuteCoughDeath said:
Thank you for this. These stats please me greatly.

How are you converting the data?

I've coded a pair of scripts. One fetches XMLs from the API and the other parses those XMLs and generates a GNUplot-ready TSV file, which I rendered using LibreOffice Calc (because I'm too lazy to figure out how to draw this kind of graph in GNUplot).
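
Something like this would probably do it straight in GNUplot, for the record (an untested sketch, not what I actually ran; the terminal, size, styling and output name are guesses, and results.txt is the TSV shown a couple of posts down):

#!/bin/bash
# Rough, untested sketch: feed the tag / total / explicit TSV to gnuplot as a
# clustered histogram instead of opening it in Calc.
gnuplot <<'EOF'
set terminal png size 1600,600
set output "pokemon.png"
set style data histograms
set style fill solid border -1
set xtics rotate by -45
plot "results.txt" using 2:xtic(1) title "total posts", \
     "" using 3 title "explicit posts"
EOF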

Updated by anonymous

Lizardite said:
I've coded a pair of scripts. One fetches XMLs from the API and the other parses those XMLs and generates a GNUplot-ready TSV file, which I rendered using LibreOffice Calc (because I'm too lazy to figure out how to draw this kind of graph in GNUplot).

I've always had problems with converting XML to GNUplot. I was using BaseX but I'm shite at DOM-type requests. Any chance of maybe letting a little kitty look?

Updated by anonymous

CuteCoughDeath said:
I've always had problems with converting XML to GNUplot. I was using BaseX but I'm shite at DOM-type requests. Any chance of maybe letting a little kitty look?

Oh sorry, I had totally missed your reply. The script I used for the Pokémon one is the following:

fetch.sh:

#!/bin/bash

PAGE="http://e621.net"

# Fetches the given URL and extracts the first count="N" attribute from the
# XML; prints 0 when the attribute is missing.
getcount() {
	local URL="$1"

	local COUNT=$(wget -q -O- "$URL" | grep -oh -m1 " count=\"[0-9]\+\"")
	if [ -z "$COUNT" ]; then
		COUNT=0
	else
		COUNT="${COUNT:8:-1}"	# strip the leading ' count="' and the closing quote
	fi
	echo "$COUNT"
}

# Reads one tag per line from stdin and prints: tag, total count, explicit
# count (tab-separated).
while read TAG; do
	TAG=$(echo "$TAG" | tr ' ' '_')
	echo -n "$TAG	"

	# total post count, taken from the tag listing
	ALLCOUNT=$(getcount "$PAGE/tag/index.xml?name=$TAG")
	echo -n "$ALLCOUNT	"

	# only ask for the explicit count when the tag has any posts at all
	if [ "$ALLCOUNT" -eq 0 ]; then
		ECOUNT=0
	else
		ECOUNT=$(getcount "$PAGE/post/index.xml?tags=$TAG%20rating:e&limit=0")
	fi
	echo "$ECOUNT"
done

pokemon.txt (one tag per line, must be in UTF-8):

Abomasnow
Abra
Absol
Accelgor
Aegislash
Aerodactyl
Aggron
Aipom
Alakazam
Alomomola
Altaria
Amaura
Ambipom
Amoonguss
...

You put them both in the same folder, and then in a shell terminal (I use Cygwin under Windows) you run:

cat pokemon.txt | ./fetch.sh | tee results.txt

The output is saved to results.txt and also printed in the terminal (so you can see what it's doing), and it looks like this:

Abomasnow	49	48
Abra	162	147
Absol	1173	1065
Accelgor	9	9
Aegislash	20	20
Aerodactyl	59	55
Aggron	159	147
Aipom	44	39
Alakazam	104	95
Alomomola	4	4
Altaria	59	55
Amaura	20	17
Ambipom	87	80
Amoonguss	31	29

A TSV (tab-separated values) file with the tag in the first column, then the total post count, and then the explicit post count.
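
The explicit/total ratio graph can be derived from the same file; a one-liner along these lines gives that ranking (illustrative, not necessarily the exact command I used):

# rank tags by explicit/total ratio, highest first
awk -F'\t' '$2 > 0 { printf "%s\t%.3f\n", $1, $3 / $2 }' results.txt | sort -t$'\t' -k2,2 -nr | head -n 20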

Updated by anonymous

The posts-per-second graph has been generated using two scripts instead. One fetches the posts and saves (caches) the XMLs, and then a second script processes them.

First script (fetcher.sh):

#!/bin/bash

# Highest post ID at the time; one post is sampled every 500 IDs.
MAX=682586
POST=1

if [ ! -d posts ]; then
	mkdir posts
fi

while [ $POST -lt $MAX ]; do
	FILE=$(printf "posts/post_%08d.xml" $POST)

	# skip posts that have already been downloaded (non-empty file)
	if [[ ! -s "$FILE" ]]; then
		wget -O "$FILE" "https://e621.net/post/show.xml?id=$POST"
	fi

	POST=$(($POST+500))
done

This creates a "posts" folder with files called "post_X.xml", where X is the zero-padded post ID.

Timestamp script:

#!/bin/bash

# Extracts the created_at attribute from one cached post XML.
post_date() {
	local XML="$1"
	local VALUE=$(grep -oh -m1 " created_at=\"[^\"]\+\"" "$XML")
	if [ ! -z "$VALUE" ]; then
		echo "${VALUE:13:-1}"	# strip the leading ' created_at="' and the closing quote
	fi
}

# For every cached post, print its ID followed by its creation time as a Unix timestamp.
while read FILE; do
	DATE=$(post_date "$FILE")
	if [ ! -z "$DATE" ]; then
		echo -n "${FILE:11:-4} "	# "posts/post_XXXXXXXX.xml" -> zero-padded ID
		date --date "$DATE" +%s
	fi
done <<< "$(ls -1 posts/post_*.xml)"

This one processes all the XMLs in that folder and outputs all the data, which looks like this:

00001001 1171932670
00001501 1172633742
00002001 1172868568
00002501 1173396584
00003001 1173885189
00003501 1174276926
00004001 1174718105
00004501 1175164135
00005001 1175900772
00005501 1176086010
00006001 1176808361
00006501 1177311281
00007001 1178258950
00007501 1179092933
00008001 1180227955
00008501 1182220347
00009001 1183460885
00009501 1185065439
00010001 1186250613
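
To get from those pairs to actual posts-per-second values, you divide the ID gap by the timestamp gap between consecutive samples; assuming the output above was saved to a file (timestamps.txt is just a placeholder name), something like this does it:

# posts per second in each 500-post interval: (ID difference) / (timestamp difference)
awk 'NR > 1 && $2 > prev_ts { printf "%d\t%.6f\n", $1, ($1 - prev_id) / ($2 - prev_ts) }
	{ prev_id = $1; prev_ts = $2 }' timestamps.txt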

Updated by anonymous

Lance_Armstrong said:
What is the posts per second spike around 130,000?

I don't know. That spike actually goes off the chart and reaches a whopping 0.01 posts per second.

Updated by anonymous
