Topic: Any data science furs remotely interested in starting a furry-specific training database for deep learning?

Posted under Off Topic

Hi All. I was interested in perhaps starting a project as titled. I was thinking of designing a small web frontend for selecting parts of images to categorize as different objects in the furry artworks available on this site. The results would be open source, compiled and available to download in a weekly update or something similar.
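To make the idea concrete, here is a minimal sketch of what one labeled region and the weekly download could look like. The field names, the `BoxLabel` class, and the JSON Lines layout are all assumptions for illustration, not a settled format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class BoxLabel:
    """One hand-selected region in an artwork (all field names are hypothetical)."""
    image_id: str  # identifier of the source artwork
    label: str     # object category chosen by the annotator
    x: int         # left edge of the box, in pixels
    y: int         # top edge of the box, in pixels
    width: int     # box width, in pixels
    height: int    # box height, in pixels


def to_json_lines(labels):
    """Serialize labels as one JSON object per line, which is easy to append to and diff."""
    return "\n".join(json.dumps(asdict(lbl), sort_keys=True) for lbl in labels)


# Made-up example data, not taken from any real post.
labels = [
    BoxLabel("12345", "tail", 40, 200, 120, 90),
    BoxLabel("12345", "ear", 10, 5, 30, 40),
]
dump = to_json_lines(labels)
print(dump)
```

A line-per-record text format would keep the weekly dump simple to merge and to load from any training framework.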

The end goal would be to make some data available for everyone to play around with for autotagging (Caffe), neural style transfer, or other deep learning objectives, using mainly hand-drawn, as opposed to photographed, artwork.

I have a number of other software projects going on, so before I put effort into creating something like this, I wanted to get an idea of whether anyone else would actually be interested in it or would participate in the data labeling at all.

Thanks for reading!

Updated

So you want people to draw bounding boxes and tag them.

I think the dumber approach of dumping all of the images and existing tag data into an algorithm should be tried first. Some tags based on outside information can be blacklisted, but even artists could be detected through art style.
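A rough sketch of that preprocessing step, assuming each post's tags arrive as a plain list of strings: drop the blacklisted tags, then turn what is left into a multi-hot target vector for a multi-label classifier. The tag names and the `build_vocab` and `multi_hot` helpers are made up for illustration.

```python
def build_vocab(tag_lists, blacklist):
    """Map every non-blacklisted tag to a stable column index."""
    vocab = sorted({t for tags in tag_lists for t in tags if t not in blacklist})
    return {tag: i for i, tag in enumerate(vocab)}


def multi_hot(tags, vocab):
    """Encode one post's tags as a 0/1 vector; tags outside the vocab are ignored."""
    vec = [0] * len(vocab)
    for t in tags:
        if t in vocab:
            vec[vocab[t]] = 1
    return vec


# Hypothetical posts; "hi_res" stands in for a tag that depends on outside information.
posts = [["fox", "tail", "hi_res"], ["wolf", "tail"]]
blacklist = {"hi_res"}

vocab = build_vocab(posts, blacklist)
vectors = [multi_hot(p, vocab) for p in posts]
print(vocab)
print(vectors)
```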

Updated by anonymous

Thanks for the replies and suggestions all ^_^

@felix: I looked at the page you linked for Zstandard, but I am not really sure how this compression standard is related, except tangentially, to the creation of a training set. Please elaborate.

@Lance Armstrong: Yes, this is the general idea. I might be able to incorporate some additional efficiency improvements like Canny edge detection / background removal. The training set would also generally need to be resized to a certain size / aspect ratio to be usable in a CNN, so decisions about which objects to bound would be influenced by this. I agree that for style transfer, using whole images might work too. But for tagging, I doubt that using whole images with all of their associated labels will work well. Has an effort to this end already been started?
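As a small illustration of the size / aspect ratio point, here is a sketch of fitting an image into a fixed square input by scaling and padding (often called letterboxing) and moving a bounding box along with it. Padding instead of stretching or cropping is just one assumed option, and the 224-pixel target and both function names are hypothetical.

```python
def letterbox(width, height, target):
    """Scale an image to fit inside a target x target square, keeping its aspect ratio.

    Returns the scale factor, the resized dimensions, and the padding added
    on the left and top to center the result.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def map_box(box, scale, pad_x, pad_y):
    """Move an (x, y, width, height) box from original to letterboxed coordinates."""
    x, y, w, h = box
    return (x * scale + pad_x, y * scale + pad_y, w * scale, h * scale)


# An 800x400 artwork squeezed into a 224x224 network input.
scale, new_w, new_h, pad_x, pad_y = letterbox(800, 400, 224)
mapped = map_box((100, 100, 200, 100), scale, pad_x, pad_y)
print(new_w, new_h, pad_x, pad_y)
print(mapped)
```

Very wide or very tall regions lose a lot of resolution this way, which is one reason the choice of which objects to bound depends on the target input shape.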

@KiraNoot: I have a little more experience with Caffe/SSD, but it looks like YOLO has gained more traction since my last foray into this. This project would not really be tied to any specific training algorithm, though. I am just looking to facilitate having something standardized in an easy format so we could start playing around with it. :3

Updated by anonymous
