Using the BLIP processor for image-to-text (image2text) captioning


from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

# Load the model and processor
# The Hub repo id is "<org>/<name>"; it must not include the "huggingface.co/" host prefix
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Load and preprocess an image
img = Image.open("data/input_image.png")
inputs = processor(img, return_tensors="pt")

# Generate caption
out = model.generate(**inputs)
caption = processor.decode(out[0], skip_special_tokens=True)

# Print the generated caption
print(caption)
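
The same processor and model can also be given a short text prefix, so the caption continues from that prefix instead of starting from scratch. The following is a minimal sketch, not part of the original snippet: `caption_image` is a hypothetical helper name, and the `prompt` and `max_new_tokens` defaults are illustrative assumptions. It expects the already loaded `processor` and `model` objects to be passed in.

```python
def caption_image(image, processor, model, prompt=None, max_new_tokens=30):
    """Return a caption for `image` (hypothetical helper, for illustration only).

    If `prompt` is given, it is passed to the processor as a text prefix
    (conditional captioning); otherwise the caption is generated unconditionally.
    """
    if prompt is None:
        # Unconditional captioning: only the image is preprocessed
        inputs = processor(image, return_tensors="pt")
    else:
        # Conditional captioning: the text prefix is tokenized alongside the image
        inputs = processor(image, prompt, return_tensors="pt")

    # max_new_tokens caps the caption length; 30 is an assumed default
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(out[0], skip_special_tokens=True)
```

With the objects loaded above, a call might look like `caption_image(img, processor, model, prompt="a photography of")`; the prompt text here is only an example.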

