Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have a dataset of images and a second dataset that describes them:

[image: attrs]

There are a lot of pictures: people with and without sunglasses, smiles and other attributes. What I want to do is be able to add smiles to photos where people are not smiling. I've started like this:

smile_ids = attrs['Smiling'].sort_values(ascending=False).iloc[100:125].index.values
smile_data = data[smile_ids]

no_smile_ids = attrs['Smiling'].sort_values(ascending=True).head(5).index.values
no_smile_data = data[no_smile_ids]

eyeglasses_ids = attrs['Eyeglasses'].sort_values(ascending=False).head(25).index.values
eyeglasses_data = data[eyeglasses_ids]

sunglasses_ids = attrs['Sunglasses'].sort_values(ascending=False).head(5).index.values
sunglasses_data = data[sunglasses_ids]

When I plot them, they look fine:

plot_gallery(smile_data, IMAGE_H, IMAGE_W, n_row=5, n_col=5, with_title=True, titles=smile_ids)

[image: faces output]

Plot gallery looks like this:

def plot_gallery(images, h, w, n_row=3, n_col=6, with_title=False, titles=[]):
    plt.figure(figsize=(1.5 * n_col, 1.7 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        try:
            plt.imshow(images[i].reshape((h, w, 3)), cmap=plt.cm.gray, vmin=-1, vmax=1, interpolation='nearest')
            if with_title:
                plt.title(titles[i])
            plt.xticks(())
            plt.yticks(())
        except:
            pass

Then I do:

def to_latent(pic):
    with torch.no_grad():
        inputs = torch.FloatTensor(pic.reshape(-1, 45*45*3))
        inputs = inputs.to('cpu')
        autoencoder.eval()
        output = autoencoder.encode(inputs)
        return output

def from_latent(vec):
    with torch.no_grad():
        inputs = vec.to('cpu')
        autoencoder.eval()
        output = autoencoder.decode(inputs)
        return output

After that:

smile_latent = to_latent(smile_data).mean(axis=0)
no_smile_latent = to_latent(no_smile_data).mean(axis=0)
sunglasses_latent = to_latent(sunglasses_data).mean(axis=0)

smile_vec = smile_latent-no_smile_latent
sunglasses_vec = sunglasses_latent - smile_latent

And finally:

def add_smile(ids):
    for id in ids:
        pic = data[id:id+1]
        latent_vec = to_latent(pic)
        latent_vec[0] += smile_vec
        pic_output = from_latent(latent_vec)
        pic_output = pic_output.view(-1,45,45,3).cpu()
        plot_gallery([pic,pic_output], IMAGE_H, IMAGE_W, n_row=1, n_col=2)

def add_sunglasses(ids):
    for id in ids:
        pic = data[id:id+1]
        latent_vec = to_latent(pic)
        latent_vec[0] += sunglasses_vec
        pic_output = from_latent(latent_vec)
        pic_output = pic_output.view(-1,45,45,3).cpu()
        plot_gallery([pic,pic_output], IMAGE_H, IMAGE_W, n_row=1, n_col=2)

But when I execute this line I don't get any faces:

add_smile(no_smile_ids)

The output:

[image: add_smile output]

Could someone please explain where my mistake is, or why this happens? Thanks for any help.

Added: checking the shape of pic_output:

[image: shape of pic_output]



1 Answer

Wild guess, but it seems you are reshaping your images with view instead of permuting the axes. view only reinterprets the existing memory order, which has the undesired effect of mixing information across channels and pixel positions.

pic_output = pic_output.view(-1, 45, 45, 3).cpu()

should be replaced with

pic_output = pic_output.permute(0, 2, 3, 1).cpu()

This assumes the tensor pic_output already has shape (-1, 3, 45, 45), i.e. channels first.
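
To make the difference between view and permute concrete, here is a minimal sketch on a hypothetical 2x2 toy tensor rather than the real decoder output. The channels-first layout is an assumption, the same one the suggestion above relies on.

```python
import torch

# Hypothetical toy batch (not the asker's data): 1 image, 3 channels, 2x2 pixels,
# stored channels first as (N, C, H, W).
# Channel 0 holds 0..3, channel 1 holds 4..7, channel 2 holds 8..11.
x = torch.arange(12).view(1, 3, 2, 2)

# view only reinterprets the flat memory order; no values are moved.
viewed = x.view(1, 2, 2, 3)

# permute reorders the axes, so each pixel keeps its own three channel values.
permuted = x.permute(0, 2, 3, 1)

# The top-left pixel's true channel values are 0, 4 and 8.
print(viewed[0, 0, 0].tolist())    # [0, 1, 2]
print(permuted[0, 0, 0].tolist())  # [0, 4, 8]
```

Both results have shape (1, 2, 2, 3), so checking the shape alone does not reveal the problem. Note that permute(0, 2, 3, 1) needs a 4-D tensor. If the decoder returns flat vectors of length 45*45*3, the right reshape depends on the axis order that was used when the images were flattened.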

