
Google Product Search vs Vertex AI multimodal embeddings

Google offers a managed service called Product Search for finding similar products by searching with images. It is a neat service that lets you do product search without manually embedding images, setting up a vector database, or manipulating images. The pricing is predictable: 1 USD per 10,000 images stored and 4.50 USD per 1,000 image queries. Google has not updated the service much, and it has no graphical interface in the Cloud Console, which makes it a bit hard to use. Like the rest of the first generation of Google Cloud AI services, it seems to be neglected in favor of the newer Vertex AI.

In this article I compare the managed Product Search service with a do-it-yourself alternative: embedding the images with Google Vertex AI's multimodal embedding model and running the similarity search myself. I will refer to the two solutions as Product Search and Vertex.

Methodology

I collected target images of 380 products from one store, all categorized as "Eggs and dairy"; these form the search space. Then I found 13 test images of products from other stores or offer catalogs, each of which I judged to belong to the "Eggs and dairy" category.

For Product Search, the target images were indexed under the "general" category with no extra tags. Each test image was then used as a query against the product collection.

For Vertex, the images were sent to the multimodalembedding@001 embedding endpoint with only the image and no text. The test images were then matched against the target embeddings with scikit-learn's NearestNeighbors (k=3), using cosine similarity.
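The nearest-neighbor step can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the test: the embeddings are random stand-ins rather than real output from the endpoint, the product labels are hypothetical, and I assume the reported score is the cosine similarity (1 minus the cosine distance that scikit-learn returns).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Stand-in data (assumption): in the real setup each row would be the
# 1408-dimensional image embedding returned for one target image.
target_embeddings = rng.normal(size=(380, 1408))
target_names = [f"product-{i}" for i in range(380)]  # hypothetical labels

# Stand-in query: a slightly perturbed copy of target 42, playing the role
# of a test image showing the same product photographed differently.
query_embeddings = target_embeddings[[42]] + 0.1 * rng.normal(size=(1, 1408))

# Brute-force k-nearest-neighbor index using cosine distance, k=3.
index = NearestNeighbors(n_neighbors=3, metric="cosine")
index.fit(target_embeddings)

# kneighbors returns cosine distances sorted ascending; convert them to
# similarities so that higher means more similar.
distances, indices = index.kneighbors(query_embeddings)
similarities = 1.0 - distances

for idx, score in zip(indices[0], similarities[0]):
    print(f"{target_names[idx]} Score: {score:.2f}")
```

With only 380 targets a brute-force search like this is fast enough, so no approximate index or vector database is needed for a one-off comparison.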

Results

Test image

Results Product Search

Go Morgen Yoghurt Score: 0.53
Go Morgen Yoghurt Score: 0.52
Go Morgen Yoghurt Score: 0.51

Results Vertex multimodal embedding

Go Morgen Yoghurt Score: 0.75
Go Morgen Yoghurt Score: 0.74
Go Morgen Yoghurt Score: 0.73

Test image

Results Product Search

Sjokomelk Score: 0.46
Sjokomelk Score: 0.46
Litago Melk Score: 0.34

Results Vertex multimodal embedding

Sjokomelk Score: 0.73
Sjokomelk Score: 0.68
Tinemelk Sjokolade Score: 0.65

Test image

Results Product Search

Tinemelk Sjokolade Score: 0.47
Sjokolademelk Score: 0.46
Melk m/Vanilje Score: 0.45

Results Vertex multimodal embedding

Helmelk Score: 0.60
Geitmelk Score: 0.56
Helmelk Score: 0.55

Test image

Results Product Search

Sjokomelk Score: 0.39
Sjokomelk Score: 0.38
Cottage Cheese Score: 0.36

Results Vertex multimodal embedding

Sjokomelk Score: 0.69
Sjokomelk Score: 0.67
Helmelk Score: 0.65

Test image

Results Product Search

Melange Margarin Score: 0.66
Melange Margarin Score: 0.66
Melange Margarin Score: 0.55

Results Vertex multimodal embedding

Melange Margarin Score: 0.57
Melange Margarin Score: 0.57
Melange Flytende Score: 0.56

Test image

Results Product Search

Melange Margarin Score: 0.30
Havregurt Score: 0.30
Milkshake Score: 0.30

Results Vertex multimodal embedding

Brelett Score: 0.51
Brelett Score: 0.46
Bremykt Score: 0.46

Test image

Results Product Search

Yt Restitusjon Score: 0.45
Yt Restitusjon Score: 0.42
Yt Proteinmelk Score: 0.42

Results Vertex multimodal embedding

Yt Restitusjon Score: 0.60
Yt Proteinmelk Score: 0.57
Yt Proteinmelk Score: 0.57

Test image

Results Product Search

Gårdsegg L Score: 0.80
Gårdsegg Score: 0.44
Egg Økologisk Score: 0.37

Results Vertex multimodal embedding

Gårdsegg L Score: 0.88
Gårdsegg Score: 0.87
Gårdsegg Score: 0.77

Test image

Results Product Search

Meierismør Score: 0.90
Meierismør Score: 0.88
Smør Usaltet Score: 0.85

Results Vertex multimodal embedding

Meierismør Score: 0.95
Meierismør Score: 0.91
Smør Usaltet Score: 0.87

Test image

Results Product Search

Sjokomelk Score: 0.47
Sjokomelk Score: 0.47
Sjokolademelk Score: 0.39

Results Vertex multimodal embedding

Sjokomelk Score: 0.90
Sjokomelk Score: 0.86
Sjokolademelk Score: 0.78

Test image

Results Product Search

Syrna Mjølk Blå Score: 0.85
Bakemjølk Score: 0.46
Lettmjølk Fersk Score: 0.44

Results Vertex multimodal embedding

Syrna Mjølk Blå Score: 0.93
Biola Syrnet Melk Score: 0.66
Skummet Mjølk Score: 0.65

Test image

Results Product Search

Biola Syrnet Melk Score: 0.52
Biola Syrnet Melk Score: 0.48
Biola Syrnet Melk Score: 0.46

Results Vertex multimodal embedding

Fløte Score: 0.60
Plantgurt Score: 0.59
Biola Syrnet Melk Score: 0.58

Test image

Results Product Search

Bremykt Score: 0.41
Soft Flora Score: 0.38
Gryr Smøreblid Score: 0.37

Results Vertex multimodal embedding

Bremykt Score: 0.61
Bremykt Score: 0.61
Bremykt Score: 0.60

Discussion

Both solutions find a correct match for most of the test images. Where both are right, they sometimes rank the matches in a slightly different order.
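As a rough tally, the top-ranked product names from the Results section can be compared pairwise. The sketch below only measures how often the two solutions agree on the name of the first hit. It says nothing about whether that hit is correct, and several products in the store share a name, so equal names do not guarantee the same product.

```python
# Top-ranked name per test image, in the order listed under Results:
# (Product Search, Vertex multimodal embedding)
top1 = [
    ("Go Morgen Yoghurt", "Go Morgen Yoghurt"),
    ("Sjokomelk", "Sjokomelk"),
    ("Tinemelk Sjokolade", "Helmelk"),
    ("Sjokomelk", "Sjokomelk"),
    ("Melange Margarin", "Melange Margarin"),
    ("Melange Margarin", "Brelett"),
    ("Yt Restitusjon", "Yt Restitusjon"),
    ("Gårdsegg L", "Gårdsegg L"),
    ("Meierismør", "Meierismør"),
    ("Sjokomelk", "Sjokomelk"),
    ("Syrna Mjølk Blå", "Syrna Mjølk Blå"),
    ("Biola Syrnet Melk", "Fløte"),
    ("Bremykt", "Bremykt"),
]

# Count test images where both solutions return the same top-ranked name.
agree = sum(1 for ps, vx in top1 if ps == vx)

# 1-based positions of the test images where the top-ranked names differ.
disagreements = [i for i, (ps, vx) in enumerate(top1, start=1) if ps != vx]

print(f"Same top-ranked name: {agree} of {len(top1)}")
print(f"Different top-ranked name for test images: {disagreements}")
```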

For "Tine Lettmelk med smak" and "Biola skogsbær/vanilje", only Product Search found a good match. Both test images came from offer catalogs and had text in the foreground, which probably made it harder for the embedding to capture only the product.

For "Philadelphia", there was no good match in the search space, so both solutions returned other products, with lower scores than for images that did have good matches. Both ranked butter or margarine highest.

For "Smøremyk", which had no visually close match in the search space, both found similar products, i.e. butter-imitation spreads.

Conclusion

In this small test, Product Search seems to offer a better solution than multimodal embeddings combined with KNN queries alone: it found a good match for two catalog images where Vertex did not. Perhaps Product Search does something in its pipeline to identify objects and extract the most relevant part of an image. For the Vertex solution to be equally good, we might have to preprocess the images to isolate the product before embedding them.

On pricing and management

Product Search is a ready-to-use service, in which experts at Google have built a good solution for the specific purpose of recognizing similar products from images. One only has to upload items for it to work. Using Vertex involves more steps and is, as demonstrated, not as good without optimizing the processing pipeline.

For a one-time indexing of images, the price is the same for Vertex and Product Search: 1 USD per 10,000 images. Product Search charges this every month, while with Vertex the embeddings are paid for once and can then be stored much more cheaply. For querying, Product Search charges 4.50 USD per 1,000 queries, while the cost with Vertex depends on how the querying is done. If the goal is to find similar products once, this can be done in a Python workbench at negligible cost. For continuous querying, the cost of running a database capable of KNN search depends on many factors. One could perhaps assume that running a vector database such as Vertex AI Vector Search costs about the same as using Product Search.
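To make the arithmetic concrete, here is a small sketch that applies the list prices quoted in this article. The function names are my own, and the calculation ignores any free tier, volume discount, or price change, so treat the output as an estimate only.

```python
# Prices as quoted in this article (assumed to still apply).
USD_PER_10K_IMAGES = 1.00      # Product Search: per month; Vertex: one-time embedding
USD_PER_1K_QUERIES = 4.50      # Product Search image queries


def product_search_monthly_cost(num_images: int, num_queries: int) -> float:
    """Estimated monthly Product Search cost: storage plus queries."""
    storage = num_images / 10_000 * USD_PER_10K_IMAGES
    queries = num_queries / 1_000 * USD_PER_1K_QUERIES
    return storage + queries


def one_time_embedding_cost(num_images: int) -> float:
    """Estimated one-time cost of embedding the images with Vertex."""
    return num_images / 10_000 * USD_PER_10K_IMAGES


# The test setup from this article: 380 target images, 13 queries.
print(f"Product Search, one month: {product_search_monthly_cost(380, 13):.4f} USD")
print(f"Vertex, embedding once:    {one_time_embedding_cost(380 + 13):.4f} USD")
```

For Vertex, the 13 query images have to be embedded as well, which is why the second call uses 380 + 13 images. The cost of storing the embeddings and running the search is not included.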

Product Search can store metadata with products, which makes it possible to filter search results. With Vertex, one can of course filter and sort freely using any database system.
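With self-managed embeddings, one simple way to filter is to restrict the candidates by metadata before ranking by cosine similarity. The sketch below uses tiny made-up vectors and a hypothetical "brand" field; the helper name filtered_top_k is mine and not part of any library.

```python
import numpy as np


def filtered_top_k(query, embeddings, metadata, predicate, k=3):
    """Rank only the rows whose metadata passes `predicate`.

    Returns a list of (row index in the full collection, cosine similarity),
    best match first.
    """
    keep = [i for i, meta in enumerate(metadata) if predicate(meta)]
    if not keep:
        return []
    subset = embeddings[keep]
    # Normalize so that the dot product equals the cosine similarity.
    q = query / np.linalg.norm(query)
    s = subset / np.linalg.norm(subset, axis=1, keepdims=True)
    sims = s @ q
    order = np.argsort(-sims)[:k]
    return [(keep[i], float(sims[i])) for i in order]


# Made-up 2-dimensional "embeddings" and metadata for illustration only.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
metadata = [{"brand": "A"}, {"brand": "B"}, {"brand": "A"}, {"brand": "A"}]
query = np.array([1.0, 0.0])

hits = filtered_top_k(query, embeddings, metadata,
                      lambda meta: meta["brand"] == "A", k=2)
print(hits)
```

Filtering before ranking guarantees k results whenever at least k items pass the filter, whereas filtering after a fixed top-k search can return fewer.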

Vertex multimodal embedding can also embed English text strings in the same 1408-dimensional space as images, so you can search your images using text alone. This is not possible with Product Search.