Digital Engagement & Cultural Heritage

Team members

Project leader: Joris Pekel

Members: Michael Stevenson, Emile den Tex, Matteo Azzi and sub-project team members (listed below)


Cultural heritage institutions are increasingly digitizing their collections and making them available online. Such projects must be evaluated in terms of the level and quality of user engagement, but how should engagement be defined and measured? This project combines four sub-projects and several case studies in order to:

  1. investigate the different levels and forms of engagement surrounding the collections of various 'open' and 'closed' cultural heritage institutions, with 'openness' referring to the accessibility of their digitized collections.
  2. investigate the specificity of digital engagement on different platforms, most importantly Wikipedia and Instagram. What would an art collection look like if optimized for inclusion in Wikipedia articles or for Instagram selfies?

Project 1: Where credit is due

Group Members: Felicitas Deiler, Leonie Herma, Mathilde Grenod


“The internet presents cultural heritage institutions with an unprecedented opportunity to engage global audiences and make their collections more discoverable and connected than ever, allowing users not only to enjoy the riches of the world’s memory institutions, but also to contribute, participate and share.” (OpenGLAM). The digitization of cultural heritage is enabled by technological progress, and its spread on the web is the product of information and communication technologies (ICTs). Nevertheless, although access is in principle free for all users around the world, the online spread of digitized collections relies heavily on cultural institutions’ policies on how knowledge may be accessed, shared, modified and used. In this research, we focus on the policies of public cultural institutions and on whether they are “open” or “closed” institutions.

In addition to public institutions’ policies, the impact of a digitized work (as in “artwork”) is strongly determined by the structural functionalities of online search engines. In our case, Google’s interface is central to researching how cultural heritage is shared online. Google’s search protocol is based both on which results are considered “relevant” (judged by their popularity; Hariri, 2011) and, in most cases, on users’ online habits (browsing history, most visited websites, previous purchases, etc.), which the search engine gathers through data collection (Rogers, 2013). The combination of public institutions’ various knowledge-sharing policies and Google’s protocols raises an interesting question: what if the results that Google returns when users query a work were determined by the institution’s “open” or “closed” status? In other words, to what extent do public institutions get credit on Google depending on whether their digitized collections are “open” or “closed”?

The purpose of this study is to evaluate to what extent cultural institutions might influence Google’s functionalities. Open knowledge organizations play a noticeable role in our research process, in particular Europeana, a multilingual collection of millions of digitized items from European museums, libraries, archives and multimedia collections. We selected six public institutions for this paper: three have “closed” policies, whereas the other three have more “open” knowledge-sharing objectives. We chose these institutions in order to compare the results given by Google and determine whether Google engages one type of institution more than the other. For each institution we selected the two most queried artworks on Europeana’s website, based on Europeana’s analytics database for 2015. This gave us a solid basis for drawing a parallel with Google’s functionalities. After explaining our methodology and presenting the results, we discuss the findings and conclude.


With the database provided by Europeana we were able to extract the two most accessed artworks for each of the institutions chosen for our research. To examine our research question we scraped Google for the name of each artwork, using the original title as given by Europeana. We first adjusted the settings of our browser, Firefox, to ensure that our results were not biased by the search engine itself. We then used the Google Scraper Tool from the Digital Methods Initiative (DMI) and inserted the artwork’s title into the second box, the query field. Scraping Google for the first 100 results for that title gave us a text file as output. To extract the URLs from this text file we applied the Harvester Tool from the DMI. The cleaned list of URLs was then copied into the first box of the Google Scraper Tool. These URLs were scraped for the institution’s name: we inserted the name into the second field, set the number of results per query to ten and ran the tool. Following these steps allowed us to see how often a cultural institution is named within the first 100 Google results for a particular artwork title.
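
The counting step described above can be sketched in Python. This is only a minimal illustration and not the DMI tools themselves: it assumes that the text of each result page has already been downloaded, and the function names, URLs and sample pages are hypothetical.

```python
import re
from urllib.parse import urlparse


def count_mentions(pages, institution):
    """Return {url: number of case-insensitive mentions of the institution}."""
    pattern = re.compile(re.escape(institution), re.IGNORECASE)
    return {url: len(pattern.findall(text)) for url, text in pages.items()}


def mentions_by_domain(counts):
    """Sum mention counts per host name, skipping pages without mentions."""
    totals = {}
    for url, n in counts.items():
        if n == 0:
            continue
        host = urlparse(url).netloc
        totals[host] = totals.get(host, 0) + n
    return totals


# Hypothetical sample standing in for the text of the scraped result pages.
sample_pages = {
    "https://www.example-museum.org/collection/1": "Rijksmuseum, Amsterdam. See the rijksmuseum collection.",
    "https://social.example.com/pin/42": "A nice map of Europe.",
    "https://www.example-museum.org/collection/2": "On loan from the Rijksmuseum.",
}

counts = count_mentions(sample_pages, "Rijksmuseum")
domain_totals = mentions_by_domain(counts)
```

Grouping the counts by host name gives the kind of per-website totals that a tag cloud would display.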

We first selected the two most queried artworks on the website of Europeana according to Europeana's 2015 analytics for each of the public institutions selected for our research:

Rijksmuseum's two most queried artworks on Europeana's website for 2015:


National Museum of Sweden's two most queried artworks on Europeana's website for 2015:


LSH's two most queried artworks on Europeana's website for 2015:


Uffizi Gallery's two most queried artworks on Europeana's website for 2015:


Museo del Prado's two most queried artworks on Europeana's website for 2015:


Musée du Louvre's two most queried artworks on Europeana's website for 2015:



The results of this research were visualized as tag clouds in order to show clearly how often the institution was named in the results for the artwork and how Google ranked them. Furthermore, we created a network with the Gephi software to give us a better understanding of the extent to which the different institutions are connected. Below we present an example tag cloud for a "closed" and an "open" public institution, namely the Museo del Prado and the Rijksmuseum.

  • Rijksmuseum-Most queried artwork on Europeana's website for 2015-"Kaart van Europa, ingekleurd als koningin, Europa volgens de nieuwste verdeeling":
In the tag cloud above one can observe that for the Rijksmuseum, an "open" institution, the first and also the most frequent query result was the webpage of the institution itself. The institution was mentioned 323,000 times within the results for the queried artwork. This can be explained not only by its open status but also by the fact that the Rijksmuseum has a large and very active website of its own. It was also striking that most websites that mentioned the Rijksmuseum were webpages of official institutions such as archives, museums and journals.
  • Museo del Prado-Most queried artwork on Europeana's website for 2015-"El Parnaso":

This tag cloud illustrates the order in which Google ranked its results for the artwork "El Parnaso", which belongs to the Museo del Prado. Within the top 100 Google results for "El Parnaso", the websites that mentioned the museum most were largely user-generated content websites such as Pinterest, Yelp, Facebook and, above all, Tripadvisor. The first website to appear in the Google results is Wikipedia, which is interesting because Wikipedia did not show up in the results for the Rijksmuseum query. We can assume that the artworks of closed institutions receive more engagement on user-generated websites because the institutions do not spread digitized versions of their artworks themselves. As a consequence, users upload their own pictures of the artworks (sometimes illegally, since taking pictures is often forbidden in museums).

For the open institutions we looked into, it was striking that their own websites had a better chance of appearing high in the Google ranking than those of the closed institutions. We observed that the artworks of “open” institutions appear more prominently on cultural institutions’ websites in Google’s results, whereas the “closed” institutions were mainly mentioned on user-generated websites. This can be explained by the fact that “closed” institutions are less likely to spread their digitized artworks online themselves; instead, internet users acquire a digitized version of the artwork and share it on social media websites. One can therefore assume that user-generated websites care less about copyright, as users share content even though those artworks are subject to restrictions.

Global Gephi visualization of "open" and "closed" public institutions:

Gephi visualization of "open" and "closed" institutions' networks:


Zoom into the main clusters of the institutions' networks


Visualizing the collected data with Gephi makes some structures more visible. Two main clusters can be identified. The "open" and the "closed" institutions each form their own network and are linked within their clusters. The "closed" institutions in particular are linked with user-generated websites such as Facebook, YouTube and Pinterest. It is also striking that the second digitized artwork we queried from the Rijksmuseum connects more to the "closed" institutions than the other artworks from the "open" institutions do. A possible explanation is that the title of this artwork, “Winter Landscape”, can also refer to photographs of winter landscapes in general; it does not necessarily refer to the artwork itself and therefore also shows up on unrelated websites.
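
A network like this can be prepared for Gephi as a CSV edge list with Source, Target and Weight columns. The sketch below illustrates that preparation step under stated assumptions: the input is a list of (institution, URL) pairs for pages that mention the institution, and the function names and sample URLs are hypothetical rather than the data used in this project.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse


def build_edges(results):
    """Count (institution, host) pairs from (institution, url) tuples."""
    edges = Counter()
    for institution, url in results:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org as one node
        edges[(institution, host)] += 1
    return edges


def edges_to_csv(edges):
    """Serialize the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical sample results; the paths are made up for illustration.
sample_results = [
    ("Rijksmuseum", "https://www.rijksmuseum.nl/en/collection/x"),
    ("Museo del Prado", "https://www.pinterest.com/pin/1"),
    ("Museo del Prado", "https://pinterest.com/pin/2"),
]

edges = build_edges(sample_results)
edge_csv = edges_to_csv(edges)
```

The edge weight records how many result pages on a website mention the institution, so heavily linked user-generated sites stand out in the resulting graph.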


To conclude, we can assume that "closed" institutions are mentioned more often in the context of social media platforms. This can be traced back to the "closed" status of the institution and is interesting because these institutions try to prevent the spread of their digitized artworks. They try to keep control over their artworks, a policy that might not be very effective, as pictures seem to spread even more, and without the institutions having any control over them. Open institutions support the spread of their artworks and therefore get “more credit” for their institution, as their official websites generally appear higher in the Google results, which does not seem to be the case for "closed" institutions. "Closed" public institutions might therefore want to reconsider their strategy for digitized collections, even if their only objective is online popularity.



Hariri, Nadjla. “Relevance ranking on Google: Are top ranked results really considered more relevant by the users?” Online Information Review, Vol. 35:4 (2011), pp. 598-610.

Open Definition. “Open Definition 2.1”.

OpenGLAM. “OpenGLAM Principles”.

Rogers, Richard. “Digital Methods”. The MIT Press, 2013. Chapter 4, “Googlization, The Inculpable Engine”, pp. 62-83.

Project 2: Pixel this

Team members: Radvile Dauksyte, Rosa Boon, Suzanne Tromp and Lianne Kersten

Galleries, libraries, archives and museums act as custodians of the world’s cultural heritage, and the internet gives them the opportunity to engage global audiences and make their collections more discoverable and connected (OpenGLAM). However, the debate about where to draw the line between art as intellectual property and art belonging to the public domain continues, and not all institutions feel the need to digitize and open up their databases yet.

According to Europeana, the largest online database of European artworks, the public domain is the material from which society derives knowledge and fashions new artworks. Therefore, a healthy public domain is essential to the social and economic well-being of society. But how do we protect the interests of artists and museums, while at the same time ensuring worldwide access to all heritage and knowledge? And what does this worldwide open access mean in the first place?

“Open collections” is a museum technology term that refers to a museum that has “opened” up all of their digital collections, and accompanying data, to be freely used by anyone (Kelly). This digitization of cultural heritage creates endless possibilities for sharing and reusing art by sampling, remixing, embedding, illustrating, doing research and so forth. An example of such an open institution is the Rijksmuseum, which provides free access to high-quality digital images of their artworks and even encourages remixing and personal usage (Kelly). In contrast to the openness of the Rijksmuseum there are “closed” institutions such as the Uffizi Gallery, whose digital databases are not open to the public, making it harder to digitally access their artworks. However, whether an artwork is easily accessible or not, images of the artworks owned by these institutions do circulate on the web in various sizes, qualities and sometimes even different colour schemes than the original work of art.

For this reason we chose to compare two “open” institutions, the Rijksmuseum in Amsterdam and the Tate Modern in London, with two “closed” institutions, the Louvre in Paris and the Uffizi Gallery in Florence. We explore the differences and similarities between these institutions by looking at the quality of the digital images they put forward and the ways in which these are used on the web. We focused primarily on paintings because, with this art form, differences in quality and colour become clearly visible when works are digitized and reused. Another interesting question is how people engaged with these images: where and how were they used? And are there any correlations between the quality of the images and the type of source they are used on? By answering these questions, we add to the ongoing debate about whether digitized artworks should belong to the public domain. Taking the above aspects into account, we formulated the following research question:

How do the accessibility and quality of the digitized artworks of (open vs. closed) art institutions influence the quality of the images of those works available online, and how does this affect the engagement with those images?


Open vs. Closed

Museums and other cultural institutions are a product of the Age of Enlightenment and its encyclopedic spirit (Bertacchini and Morando 3). Traditionally, their core mission, as Bertacchini and Morando explain, was to “preserve, catalogue and develop” a collection and to provide access to it, in order to disseminate national and global culture to the general public and make the material available for research (3). With the rise of the internet and the accompanying spread of digital images of art, however, these traditional roles are changing. Cultural goods used to exist solely in analog form, and institutions were therefore able to treat their products like property (Hughes and Lang 2). With the shift from analog to digital, cultural goods have become more “fluid”: they are easily distributed online and therefore available for “extension, recombination and innovation” (Hughes and Lang 2). Audiences may no longer prefer to receive information and content passively, because they are increasingly used to participating actively and contributing their own “knowledge, attitudes and creativity” (Sanderhoff 23). Some cultural institutions nevertheless try to keep their collections “fixed” by protecting them with copyright barriers. Today, museums face a clear tension between providing access to their digital collections and extending their control over them (Bertacchini and Morando 2). As Bertacchini and Morando describe, on the one hand museums could enhance the economic and social value of their collections by distributing digital images that can then be reused in different ways. On the other hand, by controlling their digital collections, museums can maintain their role as gatekeepers of “authenticity, integrity and contextualization” and, in the process, create new revenue for themselves (Bertacchini and Morando 2).
As mentioned before, a cultural institution is open when anyone is free to “use, reuse and redistribute a piece of its data or content –– subject only, at most, to the requirement to give credit to the author and/or making any resulting work available under the same terms as the original work” (OpenGLAM). The closedness of an institution, by contrast, might even collide with the traditional purpose of the museum. According to Bertacchini and Morando, digital cultural collections can be seen as a form of public good that is non-rival and non-excludable (4). Non-rival means that use of the good by one person does not reduce its availability to others. Non-excludable means that if the good is made available to one person, others cannot be prevented from accessing it (Bertacchini and Morando 4). This idea ties in with the theory behind the concept of “cultural commons”, which we discuss next.

Cultural Commons

The concept of the “commons” is often adopted to conceptualize the aforementioned dilemmas that arise in the new territory of globally distributed information (Hess 6). A commons is defined as a “general concept that refers to a resource shared by a group of people” (Hess 3). The concept provides a new way of looking at what is shared, or should be shared, in the world around us. According to this theory, the focus should be on collective action and on the importance of understanding “who shares what, how we share it and how we sustain commons for future generations” (Hess, 2008). In the cultural sector, the digitization of the vast collections of artworks housed by institutions, together with the option of limitless access through the internet, has sparked a new belief in the art world, namely that these digitized resources should be “set free” as cultural commons (Sanderhoff 64). When we view digital cultural heritage as commons material, we therefore see digital representations of art (and the accompanying metadata) as shared public resources that we should all be able to access and maintain (Cousins 135).

Open Policies

Looking at cultural heritage in this manner would mean that cultural institutions should adopt an open policy for their digital databases. According to Von Haller Grønbæk, all cultural institutions should endeavour to be as open as possible, in the sense that as many people as possible should have the easiest possible access to the institution’s content (141). At the same time, the institution should seek to ensure that the freely available content is shared, enriched and processed by users, whether they are citizens, students, scholars, researchers or commercial ventures (Von Haller Grønbæk 142). As Von Haller Grønbæk states in the book Sharing is Caring: Openness and Sharing in the Cultural Heritage Sector, the value of culture is “directly proportional” to the number of people who get to experience it (141). Ideally, Von Haller Grønbæk claims, the objective of all cultural institutions should therefore be to make as much knowledge as possible accessible to as many people as possible (141). In this way, the institution can disseminate “as much knowledge and insight” as it can (Von Haller Grønbæk 141). For art this is of great importance, because culture should not be seen as a “fixed” object that is passively consumed by the viewer, but as something that is alive and most “vibrant” when experienced together and shared among people (Von Haller Grønbæk 141). This process of sharing will eventually yield more culture in return, because it helps shape the basis on which “our present and future culture, democracy, economy, and all other aspects of society are based” (Von Haller Grønbæk 142). The idea that sharing sparks creativity is also noted by Michael Edson, who states that when creators have free and unrestricted access to the work of others through the public domain, “innovation flourishes” (Edson 136).
The importance of offering access to digital cultural heritage is strongly promoted by the OpenGLAM initiative (GLAM being an abbreviation for Galleries, Libraries, Archives and Museums). According to the initiative, if cultural institutions do not evolve with modern technologies, they will “at best, become relics of a bygone era, at worst stagnant and forgotten cultural archives” (Sanderhoff 30).

Digital Methods and Art

In order to analyse and compare the two open institutions with the two closed ones, we used several methods to gather the web data with which we answer our research question. As Richard Rogers, founder of the Digital Methods Initiative, argues, by analysing search engine results using digital methods one can study cultural trends as they are manifested on the internet (19, 86). In this way one can turn results (in our case, search engine images of artworks) into indicators and findings. Digital methods enable us to make “derivative works” from those results (Rogers 3), from which we can deduce the link between the quality of the images of the artworks and the openness or closedness of the institution. Research on this topic based on digital methods remains scarce, and the field of digital museum practice is still in its infancy (Sanderhoff 28). Through our research we hope to make concrete how the art world can benefit from digital methods research. We hope to provide an argument, based on solid data, for museums to open up their digital collections to the public, and through our findings to show the benefits this has for institutions. In the following method section we explain the choices we made during our research process.

Visualizations & Findings

This section will discuss the findings of this research, and will provide an answer to whether or not it is beneficial for art institutions to open up their digitized collection to the public. The findings will be substantiated through the use of visualizations.

Open vs. Closed

When looking at the policies of the aforementioned institutions to find out more about their open or closed status, we found the following. First, in order to obtain high-resolution images from the Uffizi, you have to request the image via an online form and at the same time ask for permission to use it. Second, the Louvre does offer an online database of its collections, but the image quality is limited and the images are restricted by copyright. In contrast to the Uffizi and the Louvre, the Rijksmuseum can be characterized as an “open” institution: it offers online visitors the option to freely download high-resolution images for personal use, without copyright limitations. The Tate Modern can be characterized as a semi-open institution: it offers low-resolution watermarked images for layout purposes, and copyrighted high-resolution images for an added reproduction fee. Figure 2 gives an overview of the openness of the institutions, defined by the accessibility and quality of the images available on the institutional websites.






| Institution | Accessibility | Image quality | Status |
|  | The whole collection is available for download. | High resolution |  |
|  | Most of the collection is searchable, saving is possible. Copying or reproducing however, except for personal use, is prohibited. | Low resolution |  |
| Tate Modern | All artworks older than 70 years are easily accessible. More recent works are not shown on the website. | High resolution (when the artworks are older than 70 years) | Open (semi) |
|  | Only a selection of the collection can be seen online. | Low resolution |  |


Figure 2 | Degree of openness/closedness of the institutions based on accessibility and quality of the images on their websites.

We noted that not all images were available for download on the official museum pages. For example, the Uffizi Gallery had only a few paintings available for download, in contrast to the Rijksmuseum, whose paintings were largely available for download in high resolution. Accessibility also correlated with the quality of the images, which is discussed later. The Tate Modern, for example, allows any image to be saved from its website, while the Louvre does not allow images to be saved or downloaded at all (in this case we searched for the image URL in the source code of the Louvre’s website).

Top 10 Most Popular Paintings per Institution

Next, we ranked the most popular paintings for each institution based on Google results (see Figure 3). For each institution we chose ten paintings and defined popularity by the number of results returned. Of these, the most popular painting was the one with the highest total number of results on Google Images (highlighted in Figure 3). This was The Milkmaid by Johannes Vermeer for the Rijksmuseum, Angels Announcing Jesus Birth to Shepherds by Govert Flinck for the Louvre, The Birth of Venus by Sandro Botticelli for the Uffizi Gallery and The Mud Bath by David Bomberg for the Tate Modern. It is remarkable that in three out of four cases the painting with the most online results was not ranked first in Google Images. Possibly Google’s ranking algorithm calculated or “learned” that other paintings were more important than the ones with the most online results and therefore ranked those other paintings higher.




| Rijksmuseum | Louvre | Uffizi Gallery | Tate Modern |
| The Battle of Waterloo | Portrait of Lisa Gherardini | The Birth of Venus | The Snail |
|  | Angels Announcing Jesus Birth to Shepherds |  | Bathers at Moritzburg |
| Isaac and Rebecca | The Consecration of the Emperor Napoleon and the Coronation of Empress Josephine | Tribuna of the Uffizi | Dynamic Suprematism |
| The Milkmaid | Portrait of King Louis XIV | Nativity of Jesus | Metamorphosis of Narcissus |
| Banquet of the Amsterdam Civic Guard in Celebration of the Peace of Münster |  | Judith Slaying Holofernes | The Weeping Woman |
| Rembrandt’s Self-portrait | Oath of the Horatii | Doni Tondo | The Mud Bath |
| Sir Thomas Gresham | Liberty Leading the People |  | Endless Rhythm |
| Fishing for Souls | Saint Michael Overwhelming the Demon | Madonna with Child and Two Angels | Untitled (Bacchus) |
| The Threatened Swan | Adoration of the Shepherds | Portrait of Bia Medici |  |
| Syndics of the Drapers’ Guild | Allegory of Fortune | Portraits of the Dukes of Urbino |  |


Figure 3 | Top 10 Google results from each gallery. Highlighted are the paintings that delivered the most individual search results.

We had already noticed differences in accessibility while trying to download the images from the institutional pages. To compare the quality of the images uploaded by each institution, and to evaluate this in terms of the openness of a museum, we put together a graph (see Figure 4). The graph shows how file size relates to whether a museum is open or closed. The vertical axis represents the file size (height multiplied by width), while the horizontal axis shows the top ten paintings from each institution (the letter P stands for painting). The graph shows that the images from the Rijksmuseum, an open museum, were the largest, mostly because the original images are of high quality and available for download. Images of paintings from the Uffizi Gallery were also large, although it is a closed institution. However, when we checked the source of the largest image, painting number five, we found that it had been uploaded to Wikimedia Commons. Its being in the public domain in this manner is in all probability the reason for its higher quality. By contrast, images of paintings from the Tate Modern were smaller. This could be because the Tate is a semi-open institution and, more importantly, because most of its artworks are still covered by copyright, since most modern paintings were made relatively recently. The graph also shows some patterns that we think are most likely coincidental. Among the top ten results, numbers one and ten were roughly the same size for all institutions. Also, at P6 the graph shows a dip for every institution.

Figure 4 | The file size per institution for the top 10 most popular artworks on Google.
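
The size measure used on the vertical axis, height multiplied by width, can be computed as in the following sketch. It is an illustration only: the function names are hypothetical and the sample dimensions are made up, not the measurements behind Figure 4.

```python
from statistics import mean


def pixel_area(width, height):
    """Size measure used in the text: image width multiplied by height."""
    return width * height


def mean_area_by_group(images):
    """Average pixel area per group from (group, width, height) tuples."""
    grouped = {}
    for group, width, height in images:
        grouped.setdefault(group, []).append(pixel_area(width, height))
    return {group: mean(areas) for group, areas in grouped.items()}


# Made-up dimensions for illustration; not the project's data.
sample_images = [
    ("open", 4000, 3000),
    ("open", 2000, 1000),
    ("closed", 800, 600),
]

averages = mean_area_by_group(sample_images)
```

Note that this measure reflects pixel dimensions rather than bytes on disk, so two files with the same area can still differ in compression quality.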

Top 20 Google Image Results per Institution

Next, we examined the sizes of the images uploaded on the web, which we found using the Google Reversed Image Scraper. We focused on one painting per institution: the painting that was “most popular” in terms of returned results among all searched images from that institution (see Figure 3). To visualize the size differences, we converted our data into a treemap (see Treemap 1). The highlighted parts show the image provided by the institution. As the treemap shows, the Rijksmuseum stood out in terms of the quality of the circulating images. First, the quality of the images was high compared to that of the other institutions’ images. Second, and most importantly, a large share of the pictures in the top twenty (8 out of 20 images) had the same high resolution and size as the image the institution itself provided. Another notable finding is that in three out of four cases the highlighted picture is not the largest file available online. The images for the Uffizi, the Tate and the Louvre differed strongly in quality and size, and for these three institutions the original image provided by the museum was smaller than the largest result found on Google Images.

Treemap 1 | Used file sizes compared with original institutional filesize.

The treemap suggests that, even when an institution does not provide original high-quality images on its website, larger images will nevertheless be uploaded and shared on the web. However, the images found on Google Images for the closed institutions are still smaller than those for an open institution such as the Rijksmuseum, indicating that closing off a digitized art collection will, to a certain extent, restrain the spread of high-quality images of the artworks online.
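
The comparison between the institution's own image and the circulating copies can be sketched as follows. This is a minimal illustration under stated assumptions: sizes are compared by pixel area, the function name is hypothetical, and the sample dimensions are invented rather than taken from the treemap data.

```python
def compare_to_original(original, found):
    """Count circulating images that are smaller than, equal to, or larger
    than the institution's own image, comparing by pixel area."""
    base_area = original[0] * original[1]
    summary = {"smaller": 0, "equal": 0, "larger": 0}
    for width, height in found:
        area = width * height
        if area < base_area:
            summary["smaller"] += 1
        elif area == base_area:
            summary["equal"] += 1
        else:
            summary["larger"] += 1
    return summary


# Invented (width, height) pairs for illustration; not the project's data.
sample_original = (1000, 800)
sample_found = [(1000, 800), (500, 400), (2000, 1600), (1000, 800)]

summary = compare_to_original(sample_original, sample_found)
```

A high "equal" count would correspond to the Rijksmuseum pattern described above, while a non-zero "larger" count corresponds to cases where the web holds bigger images than the institution provides.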

This restraint is further underlined by the number of results on Google Images when scraping for the artwork: whereas The Milkmaid of the Rijksmuseum got 92,998 hits on Google Images, the artwork of the Louvre got only 763 hits. It must be noted, however, that The Birth of Venus of the Uffizi Gallery got 83,398 hits on Google Images, although the Uffizi is a closed institution. One explanation may be that this image was taken from Wikimedia rather than from the Uffizi’s own website. The most popular artwork of the Tate Modern, a semi-open institution, was shared only 816 times. This can be explained by the fact that this artwork is less than 70 years old, so that sharing the digitized artwork is restricted by copyright law. In addition to the difference between the file sizes in the Google results and those provided by the institutions, it has to be noted that the institutions’ own images for the Tate Modern and the Uffizi did not show up in the top 20 results of Google Images. This indicates that these institutions, which own the original artworks, are not visible online. This could be one argument for an institution to open up its database: to become more visible online. To put the differences in file size in proportion, we made a visualization in which the sizes for each painting are placed next to each other (Figure 5).

Figure 5 | Proportion of files sizes of the paintings compared.

Just as we would be able to in real life, we can see how the images look compared to each other. However, the digitized versions give a different impression of proportion. For example, the original image of The Milkmaid is a lot bigger than the original image of The Birth of Venus, which conflicts with real life, since the real Milkmaid is far smaller than the real Birth of Venus (45.5 cm x 41 cm compared to 172 cm x 278 cm). Another thing that can be deduced from this visualization is that the image uploaded by Tate Modern, a semi-open institution, was third by size of all results. The visualization also shows that both closed institutions, the Louvre and the Uffizi Gallery, uploaded small images to their websites, and that even the largest available image from their top results was still smaller than the Rijksmuseum's image. Also, only the top result for the Uffizi Gallery is large in dimensions; the rest of the images are comparatively smaller.

Differences in colour

Besides visualising the differences in file size of each individual painting, we were also curious what the spread of different images of one painting meant for its colours. This changing of colours is also known as the “Yellow Milkmaid Syndrome”, a phenomenon the following quote explains very well:

The Milkmaid’, one of Johannes Vermeer's most famous pieces, depicts a scene of a woman quietly pouring milk into a bowl. During a survey the Rijksmuseum discovered that there were over 10,000 copies of the image on the internet—mostly poor, yellowish reproductions. As a result of all of these low-quality copies on the web, according to the Rijksmuseum, “people simply didn’t believe the postcards in our museum shop were showing the original painting. This was the trigger for us to put high-resolution images of the original work with open metadata on the web ourselves. Opening up our data is our best defence against the ‘yellow Milkmaid’. (Verwayen, Arnoldus and Kaufman 2)

The survey by the Rijksmuseum resulted in a project by Europeana: The Yellowmilkmaidsyndrome-blog, which is dedicated to collections of one painting in varied colours and quality including The Milkmaid from Rijksmuseum, Endless Rhythm from Tate Modern, and many more. By using the Color Thief tool, we tried to visualize the phenomenon of color changes in artworks found on the web as discovered through the Yellow Milkmaid project. The tool generated a colour palette for each image (see figure 6). Again, we highlighted the institutional image in order to make a clear comparison between the original and the other found images.
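The palette step can be illustrated with a minimal sketch. Color Thief itself is a separate tool and is not reproduced here; the code below swaps in a much simpler technique (coarse binning of RGB values and counting the most frequent bins) purely to show the idea of reducing an image to a few dominant colours and comparing two palettes. The function names, the bin width and the toy pixel data are assumptions made for illustration, not part of the original study.

```python
from collections import Counter


def dominant_palette(pixels, n_colors=5, step=32):
    """Return the n most frequent colours after coarse quantisation.

    pixels: iterable of (r, g, b) tuples with values 0-255.
    step:   bin width per channel (an illustrative default).
    """
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    half = step // 2
    # Map every bin back to the colour at its centre.
    return [
        (rb * step + half, gb * step + half, bb * step + half)
        for (rb, gb, bb), _ in counts.most_common(n_colors)
    ]


def palette_distance(palette_a, palette_b):
    """Mean Euclidean distance between colours of the same palette rank."""
    pairs = list(zip(palette_a, palette_b))
    return sum(
        sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5 for a, b in pairs
    ) / len(pairs)


# Toy pixel data standing in for an institutional image and a yellowed copy.
original = [(40, 60, 150)] * 6 + [(230, 220, 200)] * 4
yellowed = [(90, 90, 60)] * 6 + [(240, 220, 120)] * 4

print(dominant_palette(original, n_colors=2))
print(palette_distance(dominant_palette(original, 2), dominant_palette(yellowed, 2)))
```

A distance of zero would mean that a found image has the same coarse palette as the institutional file; larger values indicate a stronger colour shift.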

Figure 6 | Colour schemes of the institution’s most popular paintings.

What makes these colour schemes interesting is that they show that the results for The Milkmaid of the Rijksmuseum now closely resemble the original file provided by the Rijksmuseum. This means that it has proven beneficial for the Rijksmuseum to open up its collection to the public, as the ‘Yellow Milkmaid Syndrome’ has mostly disappeared from the current top 20 Google Images results for The Milkmaid. Comparing this with the results of the Louvre, one can see a considerable difference: whereas the images of the Rijksmuseum show the highest resemblance with the original, none of the results of the Louvre have the same colour scheme as the original found on its website.

The same goes for Tate Modern, where only a few results resemble the original artwork. The top twenty results on Google Images for the Uffizi, on the other hand, show a high resemblance with the original work. This might be due to the fact that the painting is available in high quality on Wikimedia Commons. It is possible that visualisations based on more than twenty results would have given more striking findings.

Source type of the Google Image Results

Finally, we researched how people engaged with the digital art images. In order to visualise this, we started by manually checking all the sources that were connected to each image. We then classified this information in Excel with the following labels: blog, wiki, education, commercial, news, social media or institution. We organised this data again with the use of treemaps. As the following figures show, we made an individual treemap for each institution, including all the Google Image results for each painting, to develop a clear overview of our findings.

Figure 7 | Engagement treemap of the Louvre.

Figure 8 | Engagement treemap of the Tate Modern.

Figure 9 | Engagement treemap of Rijksmuseum.

Figure 10 | Engagement treemap of Uffizi Gallery.
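The aggregation behind such treemaps can be sketched as follows. The original classification was done by hand in Excel; this sketch only assumes that each classified result is a row with an institution, one of the seven labels, and the image dimensions, and that the treemap cells are sized by total pixel area. The row layout, the function name and the sample values are hypothetical.

```python
from collections import defaultdict

# The seven source labels used in the manual classification.
LABELS = {
    "blog", "wiki", "education", "commercial",
    "news", "social media", "institution",
}


def treemap_data(rows):
    """Sum image area in pixels per institution and source label.

    rows: iterable of (institution, label, width, height) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for institution, label, width, height in rows:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        totals[institution][label] += width * height
    return {inst: dict(labels) for inst, labels in totals.items()}


# Made-up rows, for illustration only.
sample_rows = [
    ("Louvre", "blog", 1200, 800),
    ("Louvre", "blog", 600, 400),
    ("Louvre", "wiki", 300, 200),
    ("Rijksmuseum", "wiki", 2000, 1500),
]
print(treemap_data(sample_rows))
```

Each inner dictionary can be passed to any treemap plotting tool, with the label as the cell name and the summed area as the cell size.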

After tracking down the type of source the top images were used on, it became clear that the most popular channel for famous paintings differed per institution. However, some trends could be noted. For example, images from both the Louvre and Tate Modern were mostly distributed in the blogosphere, while images from the Uffizi Gallery and the Rijksmuseum circulated most on Wikimedia-related pages. Of course, the level of distribution across different channels varied as well: the Louvre’s images, for instance, appeared notably more often in blogs than those originating from Tate Modern. For these two institutions, the fact that the largest images were shared mostly on blogs is an interesting discovery, because blogging platforms often require images to be resized to smaller ones. The largest images overall, on the other hand, which were from the Uffizi and the Rijksmuseum, were not shared on blogs but on Wikimedia pages. Smaller images were distributed amongst different channels, varying from institution to institution. For example, paintings from Tate Modern were largely used in news and on social media, yet the images used there were comparatively smaller than those used in the blogosphere.

Furthermore, these treemaps show that the images from the Louvre, a closed institution, that were shared in the blogosphere were also of higher quality than the image provided by the institution itself. One possible explanation could be that bloggers are less concerned with copyright infringement and are therefore more likely to share high-quality images without obtaining the rights to share them. Contrasting this with the Rijksmuseum, an open institution, one sees that the high-resolution images are mostly used in more official sources such as wikis and websites with an educational purpose. It is a benefit for an institution to open up its collection, because digital images of its paintings are then distributed on platforms through which interest in the artwork or the museum itself is likely to increase. When accessibility is restricted, by contrast, images are still distributed, but across social media platforms, personal blogs or commercial sites, and in lower quality.


There were a few obstacles we encountered while conducting our research. First, our team faced some problems while downloading images from the institutions’ websites, as some institutions did not have all paintings online in a database. With regard to the artworks picked from Tate Modern’s collection, it would in hindsight have benefitted our study to pick some artworks that were older than seventy years. As the Tate was the only modern art institution we examined, and the top ten artworks that were used in the research were still copyrighted, only low-resolution images were available on Tate’s website. One consequence is that Tate Modern appeared more closed in the results than previously expected. For further research, it would be interesting to also pick an older work from the Tate, to see the differences in quality on the web. Also, a few URLs that were scraped were “not found” by the Google Image Scraper, which slightly affected the end results and visualizations and somewhat prevented us from forming a complete conclusion.

Furthermore, it is important to account for the categories of websites we distinguished in our research. To make the visuals clearer we decided to use five categories only, which automatically led to a certain degree of generalization, especially when it was difficult to categorize a website under a certain label.

Naturally, the research would have benefited from a wider sample range. Due to project constraints only four institutions were analysed, narrowing down to a top twenty of paintings. To form better and further-reaching conclusions, it would be a good idea to increase the number of researched institutions. For example, it might be interesting to look into the differences in image quality and availability between a big institution and a smaller one. Our selection of artworks may also have introduced a bias. We have treated all paintings as being of equal value and popularity, yet some paintings are better known than others: Angels Announcing Jesus Birth to Shepherds, for example, is not nearly as famous as Birth of Venus in terms of returned results (763 and 83398 respectively), and this may have influenced our research. At the same time, while we defined popularity by the number of returned results, defining what is most popular is rather difficult because it depends on many external factors, such as education, interest, intelligence, upbringing or cultural values. And while this is not the main focus of our research, we must account for the popularity bias that likely affected our findings.


Academics welcome the accessibility of culture in one click and expect great outcomes for the future (Stromberg). In our research we looked into the effects this wide accessibility has on the quality of the images that are spread digitally and on the engagement this entails. From the results, it was observed that “open collection” institutions had larger, higher-quality images available for download while “closed collection” institutions had not. This is an important finding which shows that images which circulate online vary in quality when they reach the public eye. Cuno argues that digitization is crucial for cultural development: “Digital enables a web of connections that are the raw material of intellectual discovery for a casual visitor, a student, or an art historian”, and it can be argued that access to good-quality images is a key component in the process of moving forward.

After an analysis of both the colour schemes and the results extracted with the reverse image scraper, it was obvious that the Rijksmuseum stands out in terms of openness: the original image source, coming from the institution itself, is widely spread amongst different Wikipedia and educational websites.

When it comes to engagement, some generalizations can be made about the way people engage with artworks from the four museums. The Rijksmuseum was mostly featured in Wikipedia-related articles, which shows that its images have universal use and that people engage with Dutch art in different contexts. The same could be said about the collection of Renaissance paintings in the Uffizi Gallery. Paintings from both the Louvre and Tate Modern, in the meantime, were largely featured on blogs, which shows personal engagement with the art: people express themselves and their impressions, review paintings and discuss them. This shows that world-famous paintings are still very much relevant and provides a good reason to make them legally available for download. It is also possible that blogging platforms have shrunk the uploaded images, which reduced already rather small images even more. In contrast to the other institutions, although Tate Modern paintings mostly figured in the blogosphere, they were still largely distributed across news and social media, which can indicate that this institution invests in keeping debates alive as well as in advertising current exhibitions.

Overall, this paper observed the spread of world-famous artworks in terms of size and quality. It was seen that “open collection” institutions add to the distribution of high-quality images, while artworks from “closed collection” institutions spread more in lower-quality images. The only exception was Tate Modern which, being a semi-open museum, allowed users to save medium-sized images, yet this can be explained by copyright laws: most contemporary art is still protected and cannot be used or distributed. The colour schemes showed to what extent the colours of the circulating works resembled those of the original work.


Project 3: The Wiki-friendliness Index: Researching Digitization on Wikipedia

Group Members: Maria Charitou, Ana Garza, Srushti Jadhav, Alina Niemann


Cultural institutions have a “shared dream of a world where every citizen will have access to all cultural heritage” (Europeana Strategy 2015-2020, 4). As “a significant component in defining cultural identity, nationally and internationally” (Lewis 2004, 1), cultural institutions aim to strengthen their role by taking into consideration the fast spread and accessibility of information and the growing use of Internet and digital technology. “It is argued that technology is engaging and our response to it is to be involved (with it)” and “the degree of involvement does vary depending upon how engaging (and available) the digital technology is” (Turner, 33; 35). Therefore, cultural institutions and amongst them libraries, archives and museums nowadays tend to shift their approach from objects to users, and especially signifying one major factor, the user’s engagement. Notably, “museums have gradually started making more use of social media platforms such as Facebook, Twitter and YouTube, to communicate their activities and exhibitions and increase public engagement” (Spiliopoulou et al., 287).

Public engagement can be measured not only by studying several social media platforms like the ones mentioned above, but also by examining Wikipedia, the largest online encyclopedia that “anyone can edit”. Consisting of about 20 million articles accompanied by images and circa 270 language editions, Wikipedia is “sizable and also highly visible on the web. Of crucial importance for its significance is the appearance of its articles at the top of Google’s search engine results. [...] The overall popularity of the project is also often discussed in terms of how it empowers its users as “editors” and of its collaborative, rewarding culture that fosters continued engagement” (Rogers 2013, 165). Given this, we believe that cultural institutions such as museums can take advantage of Wikipedia’s high accessibility and its unique position in promoting knowledge, especially about art and culture, and can benefit from studying users’ engagement on Wikipedia. Wikipedia’s users, for example, tend to upload images of physical paintings when referring to a painting or an artist, and most of the time these images are taken from ‘unknown resources’, i.e. not officially digitized ones. One major problem that cultural initiatives such as Europeana have to face is the difference between “open” and “closed” institutions, meaning their willingness to open their collections to the Web. Initiatives such as OpenGLAM emphasize the importance of free and open access to cultural heritage and work towards that.

In terms of openness and closedness, it is important for the institutions, museums etc. to take into consideration that “people want to re-use and play with the material, to interact with others and participate in creating something new” (Europeana Strategy 2015-2020, 10). The Rijksmuseum in Amsterdam for example, through its open policy encourages users to download paintings, and/or make their own creations based on their favourite artwork. Of course, the question of copyright seems to be an important issue, an inhibitor, that keeps museums away from establishing open policies: “a lot of the material that institutions want to make available is locked up because of copyright restrictions. Or bound by policies and business models that restrict wider accessibility” (Europeana Strategy 2015-2020, 13).

In conclusion, the aim of this research project is to delve into the characteristics of paintings available on Wikidata (a project also hosted by the Wikimedia Foundation) for use in Wikipedia articles. By looking closely into the metadata provided by Wikidata, the main purpose is to reveal users’ engagement by examining depictions, institutions and countries of origin, the openness and/or closedness of institutions, subject and/or language preferences, the sources of the paintings, and the most used paintings in Wikipedia in relation to the most popular paintings of an institution. In a broader sense, to what extent are the collections (museum paintings) friendly to Wikipedia users? And how can museums better use Wikipedia in order to improve users’ engagement?

Research Question

Which digitized paintings spread on Wikipedia, and how are they being used, as far as can be concluded from the metadata that can be retrieved from Wikidata and Wikipedia?

This research question can be subdivided into the following sub-research questions:

● Is the country of origin (of institution / of artists) visible in language preference? (e.g. Raphael in it. and Delacroix in fr.)

● Does an institution’s open/closed policy influence its use on Wikipedia?

● Do image files originate from ‘alternative’ sources? (e.g. not a museum)

● Is there a preference for a subject? (e.g. history, geography, religion, pop culture, nudity)



Initially, the research was based on the metadata items that were retrieved from Wikidata. Wikidata, a free knowledge base with 15,760,887 data items, was created in 2012 and contains structured data to provide support to Wikipedia, Wikimedia Commons and other projects, such as Wikisource and Wikivoyage. It functions as a repository that permits members of Wikipedia and researchers to reuse and process the data. Moreover, “Wikidata is a document-oriented database, focused around items. Each item represents a topic (or an administrative page used to maintain Wikipedia) and is identified by a unique number, prefixed with the letter “Q” (for example, the item for the topic ‘Politics’ is ‘Q7163’). This enables the basic information required to identify the topic the item covers to be translated without favouring any language” (Wikipedia, “Wikidata”). The purpose of the Wikidata project therefore is to maintain consistency and quality within Wikipedia articles and to increase the availability of information in smaller language editions. Thus, in terms of this project, the retrieval of data about artworks from Wikidata would allow us to get well-structured metadata that can be used further on for research.

Querying Wikidata via the API

At first we queried Wikidata via the API (Application Programming Interface) to get the necessary data for our research. In order to specify a data sample that would fit our research aim and contain sufficient items for a representative study, we limited our study to items that have an entry for the property ‘painting’ in the Wikidata database. It is worth mentioning that in total there are 117,447 items in the Wikidata database that can be identified as ‘painting’. To narrow down our sample and to make sure that the retrieved data items have sufficient metadata for further analysis, we decided to retrieve only those items with the property ‘painting’ that also have at least one of the following properties (metadata): ‘wikidata ID’, ‘filename’ (of the image file uploaded to the item, i.e. painting), ‘inception’ (date of the creation of the item, i.e. painting) and ‘depicts’ (tags that describe the content of the item, i.e. painting). A dataset with 64,707 entries was then retrieved from the Wikidata database that contained the following information (metadata): Table n. 1: wikidata_id, wikidata_url, image_filename, image_url, inception, title, creator, depictions_ids.
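A comparable selection can be sketched today against the public Wikidata SPARQL endpoint. This is not the API call used in the study, which is not documented here; it is an illustration that assumes the standard Wikidata identifiers P31 (instance of), Q3305213 (painting), P18 (image), P571 (inception), P170 (creator) and P180 (depicts). The function names, the default limit and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=100):
    """Build a SPARQL query for paintings with an image, inception or depicts value."""
    return f"""
SELECT ?item ?itemLabel ?image ?inception ?creator ?depicts WHERE {{
  ?item wdt:P31 wd:Q3305213 .
  OPTIONAL {{ ?item wdt:P18 ?image . }}
  OPTIONAL {{ ?item wdt:P571 ?inception . }}
  OPTIONAL {{ ?item wdt:P170 ?creator . }}
  OPTIONAL {{ ?item wdt:P180 ?depicts . }}
  FILTER(BOUND(?image) || BOUND(?inception) || BOUND(?depicts))
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
"""


def fetch_paintings(limit=100):
    """Send the query and return the raw result bindings (needs network access)."""
    params = urllib.parse.urlencode({"query": build_query(limit), "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "heritage-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    print(build_query(limit=10))
```

Calling `fetch_paintings(10)` would return one binding per painting and property combination, so a painting with several depicts values appears in several rows and still has to be grouped by item to obtain a table like Table n. 1.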

Analysis of the depictions of the Wikidata items

From the total number of Wikidata entries (64,707) that were scraped about paintings, 14,240 of them (i.e. a percentage of 22%) had at least one depiction tag ID (e.g. ‘467’). The calculation of the frequency of the different depiction tag IDs in our sample was done by using an Excel formula and ordering them from maximum to minimum. In order to translate the depiction tag IDs into keywords, that could then further be studied, we manually looked up the 100 most used depiction tag IDs in Wikidata by copying the ID (for example ‘467’) and pasting it onto the URL of Wikidata. For example the depiction tag ID ‘467’ refers to the keyword ‘woman’.
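The frequency count that was done with an Excel formula can also be sketched in a few lines. The sketch assumes that the depictions_ids column of Table n. 1 holds the tag IDs of one painting as a single string separated by semicolons; that separator, the function name and the sample rows are assumptions. Only the ID ‘467’ (‘woman’) comes from the text, the other IDs are placeholders.

```python
from collections import Counter


def depiction_frequencies(rows, separator=";"):
    """Count how many paintings carry each depiction tag ID.

    rows: list of dicts with a 'depictions_ids' string, e.g. '467;X1'.
    Returns (counter of tag IDs, share of rows with at least one tag).
    """
    counts = Counter()
    tagged = 0
    for row in rows:
        ids = [t.strip() for t in row.get("depictions_ids", "").split(separator)]
        ids = [t for t in ids if t]
        if ids:
            tagged += 1
            counts.update(set(ids))  # count each tag once per painting
    share = tagged / len(rows) if rows else 0.0
    return counts, share


# Placeholder rows: 'X1' and 'X2' are not real Wikidata IDs.
sample = [
    {"depictions_ids": "467;X1"},
    {"depictions_ids": "467"},
    {"depictions_ids": ""},
    {"depictions_ids": "X2;467;X1"},
]
counts, share = depiction_frequencies(sample)
print(counts.most_common(3), share)

# The share reported in the text: 14,240 tagged entries out of 64,707.
print(round(14240 / 64707 * 100))
```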

Timeline of the five most appearing depictions within Wikidata

We created a graph only for the five most frequent depiction tags: ‘woman’, ‘man’, ‘sky’, ‘tree’ and ‘Mary (mother of Jesus)’. Our decision to choose these depiction tags was mostly led by the fact that, on the one hand, these tags are used most often (meaning that there would be sufficient data for creating a graph for each of them) and that, on the other hand, the tags also represent themes that we found among the depictions. Such themes are, for example, the depiction of humans (represented by the tags ‘woman’ and ‘man’) versus the depiction of objects (represented by the tags ‘sky’ and ‘tree’). Also, all of these depiction tags might be related to Christian elements, which was another strong theme that unfolded among the depiction tags. To create the graph, we first made a new table that showed the frequency of appearance of the five depiction tags mentioned above. Then, we matched the depiction tags to the year of creation of the respective painting to which they are attributed (as could be gathered from the metadata ‘inception’ of each painting). With all that data gathered, we created a graph in Excel that shows the change in frequency of the five most used depiction tags over time (from 1400 to 2012).
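The matching of tags to inception years can be sketched as a simple bucketed count. The graph itself was made in Excel; the sketch below only assumes that each painting is available as a pair of an inception year and a list of tag keywords, and groups the years into 50-year buckets. The bucket width, the function name and the sample rows are illustrative assumptions.

```python
from collections import Counter

TOP_TAGS = ["woman", "man", "sky", "tree", "Mary (mother of Jesus)"]


def tag_timeline(rows, tags=TOP_TAGS, start=1400, end=2012, bucket=50):
    """Count paintings per tag and time bucket.

    rows: iterable of (inception_year or None, list of tag keywords).
    Returns {tag: Counter({bucket_start_year: number_of_paintings})}.
    """
    wanted = set(tags)
    timeline = {tag: Counter() for tag in tags}
    for year, painting_tags in rows:
        if year is None or not (start <= year <= end):
            continue  # skip paintings without a usable inception year
        bucket_start = start + ((year - start) // bucket) * bucket
        for tag in set(painting_tags) & wanted:
            timeline[tag][bucket_start] += 1
    return timeline


# Made-up rows, for illustration only.
sample_rows = [
    (1665, ["woman"]),
    (1485, ["woman", "sky", "tree"]),
    (1490, ["woman"]),
    (None, ["man"]),
    (1350, ["man"]),
]
timeline = tag_timeline(sample_rows)
print(timeline["woman"])
```

Each counter can be plotted as one line in the graph, with the bucket start year on the horizontal axis.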

Connecting Wikidata with Wikipedia

Subsequently, we linked the information extracted from the Wikidata items to the actual Wikipedia articles that feature these items, i.e. paintings. The purpose of this was to study the reach of digitized paintings within Wikipedia and thus investigate a possible relation between this reach and whether the paintings belong to open or closed institutions. We moreover wanted to know where the original image file visible on Wikipedia comes from, since we wanted to investigate if Wikipedia users upload image files regardless of whether they have legal access to them or not.

Querying Wikipedia: By querying Wikipedia we retrieved the following metadata related to the Wikidata items that are marked as ‘paintings’ and have an image file uploaded to them. The table contains 28,613 Wikidata items in total with the property ‘painting’ and an uploaded image file. The items are being used in a total of 160,174 Wikipedia articles. Scanning the dataset, we realized that the table also contained entries that referred to Wikidata, Wikisource, Wikiquote, Wikipedia archives and user pages instead of only Wikipedia pages, i.e. articles.

Table n. 2: wikidata_id, Wikicommons_ID, Picture_Title, Article_Title, Article_Wiki, Article_URL.

Frequency of Wikipedia articles per Wikidata painting: Our next step was to find out which paintings are most often featured in Wikipedia articles. Therefore, we first sorted the table we retrieved from Wikipedia according to the Wikidata ID, since each ID represents a painting, and counted the entries per ID. We then ordered the counts top-down, so that the most used Wikidata_IDs (i.e. paintings) are listed at the top of the table. Based on this analysis we got a top 100 list of the Wikidata paintings that appear most often in Wikipedia articles.
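The counting step can be sketched as follows. The rows are assumed to be reduced to three columns of Table n. 2 (wikidata_id, Article_Wiki, Article_Title). Whether the non-article entries were filtered out in the study is not stated; the optional filter here, including the site codes and the "User:" prefix it checks, is a hypothetical illustration of how such entries could be excluded.

```python
from collections import Counter

# Assumed site codes for pages that are not Wikipedia articles.
NON_ARTICLE_WIKIS = {"commonswiki", "wikidatawiki"}


def is_article(article_wiki, article_title):
    """Rough check that a row points to a Wikipedia article (assumed format)."""
    return (
        article_wiki.endswith("wiki")
        and article_wiki not in NON_ARTICLE_WIKIS
        and not article_title.startswith("User:")
    )


def top_paintings(rows, n=100, articles_only=True):
    """Rank Wikidata IDs by the number of pages that use the painting.

    rows: iterable of (wikidata_id, article_wiki, article_title) tuples.
    """
    counts = Counter(
        wikidata_id
        for wikidata_id, wiki, title in rows
        if not articles_only or is_article(wiki, title)
    )
    return counts.most_common(n)


# Placeholder rows: 'Q1' and 'Q2' stand in for real painting IDs.
sample_rows = [
    ("Q1", "enwiki", "A"),
    ("Q1", "frwiki", "B"),
    ("Q1", "commonswiki", "C"),
    ("Q2", "enwiki", "User:X"),
    ("Q2", "dewiki", "D"),
    ("Q1", "enwiki", "E"),
]
print(top_paintings(sample_rows))
```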

Wikipedia language version per painting: In order to retrieve information about the different Wikipedia language versions the paintings appear in, we used the same list of the top 100 paintings most featured in Wikipedia articles. We chose an open source software tool for working with the data, ‘OpenRefine’ (formerly Google Refine), to count the different language versions for each of the top 100 paintings with the most Wikipedia articles. We then chose the first 20 paintings and sorted them according to the count of language versions the articles exist in, with the paintings featured in the most languages on top. For example, the self-portrait by Eugène Delacroix (1837) appears in 10 different languages. It is most often featured in French articles (see also ‘Findings’). The purpose of this analysis was to explore the reach of the different paintings across different Wikipedia language versions and therefore the scale of their internationalization and cultural spread.
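The language count that was done in OpenRefine can be sketched in the same way. The sketch assumes the same three columns as above and treats each distinct Article_Wiki value (for example "frwiki") as one language version; the function name and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def language_versions(rows):
    """Summarise language spread per painting.

    rows: iterable of (wikidata_id, article_wiki, article_title) tuples.
    Returns (wikidata_id, number_of_languages, most_used_wiki) tuples,
    ordered with the paintings featured in the most languages on top.
    """
    per_painting = defaultdict(Counter)
    for wikidata_id, article_wiki, _title in rows:
        per_painting[wikidata_id][article_wiki] += 1
    summary = [
        (pid, len(wikis), wikis.most_common(1)[0][0])
        for pid, wikis in per_painting.items()
    ]
    return sorted(summary, key=lambda row: (-row[1], row[0]))


# Placeholder rows, for illustration only.
sample_rows = [
    ("Q1", "frwiki", "a"),
    ("Q1", "frwiki", "b"),
    ("Q1", "enwiki", "c"),
    ("Q2", "enwiki", "d"),
]
print(language_versions(sample_rows))
```

For a painting such as the Delacroix self-portrait, the second value would be the number of language versions (10) and the third the wiki in which it is featured most often.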

Matching the paintings most used in Wikipedia articles to the institutions the physical painting belongs to: Among our research interests was also to find out whether there is a link between the appearance of paintings in Wikidata/Wikipedia and the institutions they belong to, including the policy these institutions hold in regard to openness vs. closedness. Querying Wikidata via the API again, we retrieved a table containing the following metadata for all items in Wikidata that have the property ‘painting’ and that have an image file attached to them:

Table n. 3: wikidata_ID, institution_ID, institution

Subsequently, we matched this table with the list of the top 100 paintings appearing in Wikipedia articles based on their Wikidata_ID. To determine whether the institution the respective painting belongs to has an open or a closed policy in regard to publishing the digitized versions of its collections, we manually went through the list and categorized the institutions. This categorization was based on the experience of Europeana in working with these institutions. We were, however, not able to categorize all of the institutions, since Europeana only collaborates with institutions in Europe and our database also contained paintings located, for example, in the United States of America. There were image files uploaded to the Wikidata entries for paintings even if the painting belongs to an institution with a closed policy, and we therefore wanted to research further where these image files come from. We used the information about the museums to create a pivot table to count how often the different museums appear, and ordered them top-down.
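The matching and counting described above can be sketched as a dictionary lookup followed by a count, which is what the pivot table does. The sketch assumes Table n. 3 is available as a mapping from Wikidata_ID to institution and that the manual categorization is a mapping from institution to policy; the function name, the fallback values "unknown" and "uncategorized", and the sample IDs are hypothetical.

```python
from collections import Counter


def institutions_for_top(top_ids, painting_to_institution, policy):
    """Count top paintings per institution and attach the open/closed policy.

    top_ids:                 Wikidata IDs of the most used paintings.
    painting_to_institution: mapping from Wikidata ID to institution name.
    policy:                  manual categorization, institution -> policy.
    """
    counts = Counter()
    for painting_id in top_ids:
        institution = painting_to_institution.get(painting_id, "unknown")
        label = policy.get(institution, "uncategorized")
        counts[(institution, label)] += 1
    return counts.most_common()


# Placeholder IDs; the policies follow the labels used earlier in this report.
top_ids = ["Q1", "Q2", "Q3", "Q4"]
painting_to_institution = {"Q1": "Rijksmuseum", "Q2": "Rijksmuseum", "Q3": "Louvre"}
policy = {"Rijksmuseum": "open", "Louvre": "closed"}
print(institutions_for_top(top_ids, painting_to_institution, policy))
```

Institutions that Europeana could not categorize end up under "uncategorized", which keeps them visible in the count instead of dropping them silently.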

Image file sources of the paintings used the most in Wikipedia: In order to find out whether the image files used in Wikipedia, i.e. Wikidata, come from the institutions that digitized the respective painting or from another source, we conducted another manual analysis. We manually looked up every image file of the 100 most used paintings in Wikipedia by copying the Wikidata_ID (for example ‘29530’) of the respective image from table 2 and pasting it onto the URL of Wikidata. We then clicked on the image file, where additional metadata about the source of the image file is stored. We saved this information in a table and then used a pivot table to count how often the different sources appear, and ordered them top-down.

Analysis of the depictions of the top 100 paintings appearing in Wikipedia articles: Another interesting question is how the top 100 most used paintings relate to all paintings stored in the Wikidata database in terms of their depiction tags. Hence, we manually went through the top 100 list of most featured paintings in Wikipedia and extracted the Wikidata_ID of the paintings. A new table was created which contained these Wikidata_IDs (copy-paste procedure). Next, we manually searched for each of these Wikidata_IDs in Table n. 1, copied the depiction IDs that belong to the respective Wikidata_ID entry and pasted them into the new table. After finishing this, we “translated” the depiction IDs in the new table into the respective keywords we got from the first analysis of the depictions and counted their total frequency. The table below shows the information (metadata) we obtained:

Table n. 4: Wikidata_ID, Depiction_ID, tags, Frequency of tags.

In order to better compare these results with the depictions counted on the basis of table n. 1 (i.e. the first ‘word cloud’ we created), we created a ‘word cloud’ from this data using the service from the online tool “wordle” (see ‘Findings’).
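Next to the word clouds, the two tag counts can also be compared numerically. The sketch below is an addition for illustration and was not part of the study: it assumes that both counts are available as dictionaries from keyword to frequency and turns them into relative shares, so that the full dataset and the top 100 can be compared despite their different sizes. The function name and the numbers are made up.

```python
def compare_shares(all_counts, top_counts):
    """Return {tag: (share in all paintings, share in top 100)}."""
    total_all = sum(all_counts.values())
    total_top = sum(top_counts.values())
    shares = {}
    for tag in sorted(set(all_counts) | set(top_counts)):
        share_all = all_counts.get(tag, 0) / total_all if total_all else 0.0
        share_top = top_counts.get(tag, 0) / total_top if total_top else 0.0
        shares[tag] = (share_all, share_top)
    return shares


# Made-up frequencies, for illustration only.
all_counts = {"woman": 3, "sky": 1}
top_counts = {"woman": 1, "sky": 1}
for tag, (share_all, share_top) in compare_shares(all_counts, top_counts).items():
    print(f"{tag}: {share_all:.0%} of all tags, {share_top:.0%} of top 100 tags")
```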

Analysis of the subjects of paintings in Wikidata: Analysing the subjects of the paintings that appear in Wikidata gives us an overall view of the paintings and their depictions. An Excel file was created with three columns: ‘labels’, ‘count of wikidata painting’ and ‘label string’.

Project 4: #Museumselfie

Group Members: Xeniya Kondrat, Ka-Tjun Hau, Alexandra van Ditmars, Melanie Zierse, Fabian de Bont


Engagement in museums is not new at all, but digital engagement that uses social interaction to bring visitors and artworks together is something of recent years. Via the various social media channels, museums try to engage their visitors with the museum and with individual pieces of art. Museums are still exploring these options, and more time is needed before they can fully embrace the digital methods that appear with the rise of social media usage.

The concept of citizen curators
One of the problems of social media-usage in museums is the way collections and exhibitions are assembled. The curator of a museum is the “assembler of many voices,” said Eric Johnson (Jefferson Library, Monticello) about what the curatorial voice in an age of social media should be.

What does that mean for the involvement of citizens in the process of curating a collection in a museum? And what should the role of social media be?

Neal Stimler, from the Metropolitan Museum of Art, stresses the importance of expertise. He says that not everybody can be a curator, since it requires study. Experts are more trustworthy and have a critical mind:

“While scholars and museum visitors contribute to the enrichment of curatorial practice through a social media dialogue, I do not share the view that using social media makes everyone a curator. Curators are the most trusted art experts, whose aggregated knowledge, critical thinking abilities, and aesthetic observations define the meaning and value of art.”

Meanwhile there is a shift towards democratizing collections. This does not mean that curators will become unnecessary, but they should take the public voice into account. That means that they should not only scan social media - Facebook, Twitter, Instagram, YouTube etc. - but also actively engage the audience by setting up initiatives.

The #museumselfie is one of the initiatives to engage museum visitors with art in a fun way. The hashtag is mostly used on Twitter and Instagram. It was initiated through the website of the American culture-watcher Mar Dixon. The aim is ‘to make museums both less haughty and more physical, as it would encourage a particular type of bending and darting to see the art around people's posturin' and posin’.’

The first Museumselfieday was in 2014. This year, on January 20th, the third edition will be held worldwide.

Not everyone is happy about Museumselfieday, for two reasons. First of all, some museums ban selfies, whereas other museums encourage them. Last year the Metropolitan Museum of Art in New York banned selfie sticks to discourage the taking of selfies.

Second, there are people who think that artworks lose their value when selfies are taken in museums. The otherwise liberal newspaper The Guardian wrote in a column that “people in art galleries should look at art and not take pictures of themselves in front of it”. The Chair of Arts Council England went as far as to regard selfies as a very big problem and even suggested that art galleries should impose an hour-long selfie ban every day.

Since the 2010s, there have been more and more positive views of selfie engagement. Alli Burness, curator in various Australian museums, wrote a master's thesis about selfies at museums. She says:

“Museum visits and photography are performative, a kind of performance. We use them as a means to create our identity. A museum is a kind of theatre. When we view an exhibit in a museum we assume a role, which we also do when we pose to have our picture taken.”

In Sweden, museum selfies are welcomed as well. There was even a how-to-make-a-selfie workshop in the National Museum of Fine Arts in Stockholm as part of the exhibition ‘Selfies – Now and Then’. Margareta Gynning, curator at that museum, says that selfies and portraits are both rooted in the need for contact. She argues that museums can use selfies as a tool for discussing photographs and art. She claims:

“We only exist in our interaction with others and we want to be acknowledged. It is a basic human need, which is why I believe that banning selfies would be completely absurd.”

So basically there are two schools of thought. On the one hand, there are museums that embrace and encourage the museumselfie. On the other hand, there are museums that want to ban it. Over time, however, we see a shift whereby museums embrace museumselfies more and more. An interesting development is that museums now use museumselfies for collections and exhibitions, and to engage visitors.

In that light, research on the selfie-curated museum on Instagram is very useful for understanding new developments in digital engagement and museums.

Research question(s)

How is the #museumselfie practiced on Instagram? And how does it affect 'digital engagement' with artworks in museums?

Should museums support visitors taking selfies in the museum?

How do museums encourage them? Are they successful? And why does this matter?

Is there a correlation between taking selfies and the artworks?


The data for this research project was retrieved from Instagram posts with the Instagram Hashtag Explorer. The query #museumselfie was used, with the number of iterations set to the maximum (1000) and with the options for retrieving user information and thumbnails of the photographs enabled. These settings resulted in 12,327 Instagram posts with #museumselfie.

Within this data, we identified the top 9 museums with the strongest co-hashtag connections to #museumselfie, using Gephi's visualization and data laboratory (see Figure 1 in Results & visualizations). The top 9 museums were then explored manually, and a selection of the artworks that appeared most often was analyzed. Adobe Illustrator was used to create an illustration showing the increasing usage of #museumselfie per museum, with examples of the artworks most often posted with this hashtag on Instagram (see Figure 2).

In order to map out the 'semantic sphere' of the co-hashtags, we used Gephi to visualize the correlation between co-hashtags in relation to #museumselfie (see Figure 1).
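The co-hashtag step can be illustrated with a minimal sketch. It assumes each post is available as a list of hashtags; the function name `cohashtag_edges` and the toy posts are hypothetical and are not taken from the Instagram Hashtag Explorer output. It counts how often two hashtags occur in the same post, which is the kind of weighted edge list a network tool such as Gephi can visualize.

```python
from collections import Counter
from itertools import combinations


def cohashtag_edges(posts, seed="museumselfie"):
    """Count how often each pair of hashtags appears together in a post.

    `posts` is an iterable of hashtag lists; only posts containing the
    seed hashtag are considered. This is an illustrative sketch, not the
    tool's actual algorithm.
    """
    edges = Counter()
    for tags in posts:
        # Normalise: lowercase, drop the leading '#', remove duplicates.
        unique = sorted({t.lower().lstrip("#") for t in tags})
        if seed not in unique:
            continue
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy posts, for illustration only.
sample_posts = [
    ["#museumselfie", "#moma", "#art"],
    ["#museumselfie", "#louvre", "#selfie"],
    ["#museumselfie", "#moma", "#selfie"],
]

edges = cohashtag_edges(sample_posts)
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

Each `(hashtag, hashtag)` pair with its count corresponds to one weighted edge; hashtags that frequently co-occur with the seed end up close to the central node in the resulting graph.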

Google Fusion Tables enabled us to create a 'feature map' for an exploratory analysis of the Instagram posts that contained geotags. 4,811 of 12,322 Instagram posts with #museumselfie contained geotags. Google Fusion Tables was also used to identify the most popular artworks for selfies within the museums. This search was performed manually by filtering the table per museum and counting the artworks. We used CartoDB to make an extended 'heat map' of approximately 4,812 geotags (see Figure 4).
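As a rough illustration of the geotag filtering described above, the sketch below separates posts with coordinates from posts without them and computes the geotagged share. The record layout (the `lat` and `lon` keys) and the toy records are assumptions; only the two totals at the end come from the text.

```python
def geotag_share(posts):
    """Return (number of geotagged posts, their share of all posts)."""
    tagged = [
        p for p in posts
        if p.get("lat") is not None and p.get("lon") is not None
    ]
    share = len(tagged) / len(posts) if posts else 0.0
    return len(tagged), share


# Hypothetical toy records; the field names are assumptions.
sample = [
    {"id": 1, "lat": 40.7614, "lon": -73.9776},
    {"id": 2, "lat": None, "lon": None},
    {"id": 3, "lat": 48.8606, "lon": 2.3376},
    {"id": 4},
]
count, share = geotag_share(sample)
print(count, f"{share:.0%}")

# The same ratio for the figures reported in the text:
# 4,811 geotagged posts out of 12,322.
reported_share = 4811 / 12322
print(f"{reported_share:.1%}")
```

On the reported figures, a little under two fifths of the posts carry a location, so the maps describe a subset of the data rather than all #museumselfie posts.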

Additional research was done to look for correlations between the top 9 museums and #selfie. The same Instagram Hashtag Explorer settings as for #museumselfie were used to extract the data, and Gephi was then used to analyze it. The RAW web tool was used with the extracted data to create an illustration of the results (see Figure 3).


Google Fusion Tables limited our presentation of the locations on the heat map. The limit is 1000 geolocations, whereas our data contains 4,811 geolocation tags. Therefore, another tool, CartoDB, was used to create a complete heat map.

Originally, a top 10 list of museums was created, with the Dutch branch of the Hermitage (Hermitage Amsterdam) in last place according to the retrieved results. During the analysis, the research team found that most of these results belonged to the State Hermitage Museum in Saint Petersburg, Russia. Only two results out of 38 were related to the Dutch branch of the museum. It was therefore decided to exclude the last result and focus on the top 9.

Finally, the preliminary research on Instagram showed that there are 15,652 results for the #museumselfie query. However, it was possible to extract only 12,327 results with the Instagram Hashtag Explorer at the maximum number of iterations. The difference is most likely caused by videos, which are not captured by the tool.
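The size of this gap can be made explicit with a small calculation. The sketch below uses only the two totals reported above; the function name `extraction_coverage` is hypothetical.

```python
def extraction_coverage(platform_total, extracted):
    """Return (number of posts not extracted, share of posts extracted)."""
    missing = platform_total - extracted
    return missing, extracted / platform_total


# Totals reported in the text: 15,652 results on Instagram,
# 12,327 posts extracted with the Instagram Hashtag Explorer.
missing, coverage = extraction_coverage(15652, 12327)
print(missing, f"{coverage:.1%}")
```

In other words, the extracted data covers roughly four fifths of the posts that Instagram itself reported for the hashtag.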

Results & visualizations

Figure 1 below shows the Gephi visualization of the co-hashtags of #museumselfie. In this visualization we highlighted (in pink/purple) the ‘Top 9 museums’ that appeared most often together with #museumselfie.


Figure 1 Gephi visualization of #museumselfie co-hashtags with the selection of Top 9 Museums

We can see that the nodes ‘museum’, ‘selfie’, and ‘art’ are most strongly connected with the central node ‘museumselfie’. The clusters at the top left and bottom left of ‘museumselfie’ consist of hashtags that appear frequently together with #museumselfie. These clusters contain co-hashtags that have often been used and copied together in Instagram posts. The top left cluster contains hashtags that have a strong connection with the node ‘contemporary art’. The bottom left cluster is interesting: it is related to ‘Expo 2015’, a universal exhibition hosted by Milan, Italy, last year.

Below is a list of the top 9 museums. In addition we labeled the museums with ‘classical’ or ‘modern’. ‘Modern’ indicates that the museum’s collection is considered modern art, and ‘classical’ indicates a more classical collection.

The top 9 museums are:

1. #moma, Museum of Modern Art, New York. (modern)

2. #metmuseum, The Metropolitan Museum of Art, New York. (classical)

3. #louvre, The Louvre Museum, Paris. (classical)

4. #miaqatar, Museum of Islamic Art, Doha. (classical)

5. #britishmuseum, The British Museum, London. (classical)

6. #guggenheim, The Guggenheim Museum Bilbao, Bilbao. (modern)

7. #lacma, Los Angeles County Museum of Art, Los Angeles. (modern)

8. #hermitage, The State Hermitage Museum, St. Petersburg. (classical)

9. #whitneymuseum, Whitney Museum of American Art, New York (modern)


Figure 2 Top 9 museums within #museumselfie and the most popular artworks for #museumselfie

After finding out which museums are the most popular within #museumselfie, we explored the co-hashtag correlation between these museums and #selfie. The visualization below shows the proportional relation of each museum to #selfie. The Louvre Museum has the strongest relation with #selfie, with 339 mentions. The full list of results is:

Louvre Museum - 339 nodes of #selfie

Los Angeles County Museum of Art - 175 nodes of #selfie

Museum of Modern Art - 160 nodes of #selfie

British Museum - 135 nodes of #selfie

Whitney Museum - 128 nodes of #selfie

Guggenheim Museum Bilbao - 115 nodes of #selfie

State Hermitage Museum (Государственный Эрмитаж) - 97 nodes of #selfie

Metropolitan Museum of Art - 87 nodes of #selfie

Museum of Islamic Art - 24 nodes of #selfie


Figure 3 Circle packing cluster with #selfie and Top 9 Museums
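To make the proportions behind Figure 3 easier to compare, the counts listed above can be converted into shares of the combined total. This is a minimal sketch using only the nine counts from the list; the museum names are written out in English, and the variable names are illustrative.

```python
# #selfie co-mention counts per museum, as listed in the text.
selfie_counts = {
    "Louvre Museum": 339,
    "Los Angeles County Museum of Art": 175,
    "Museum of Modern Art": 160,
    "British Museum": 135,
    "Whitney Museum": 128,
    "Guggenheim Museum Bilbao": 115,
    "State Hermitage Museum": 97,
    "Metropolitan Museum of Art": 87,
    "Museum of Islamic Art": 24,
}

total = sum(selfie_counts.values())
shares = {name: count / total for name, count in selfie_counts.items()}

# Rank museums from the largest to the smallest share.
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
for name, share in ranked:
    print(f"{name}: {share:.1%}")
```

The shares are relative to the nine museums only, so they indicate how the #selfie co-mentions are distributed within the top 9, not within all #museumselfie posts.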

The heat map, which was created from the extracted data with CartoDB, shows the locations with the highest density of #museumselfie usage. The darker the red, the higher the usage of the hashtag. The dark red spots indicate that #museumselfie usage on Instagram is concentrated on the East Coast of the USA (New York), the West Coast of the USA (Los Angeles), in Europe (Paris and London), in North-West Russia (Saint Petersburg) and in the Middle East (Doha, Qatar). Additionally, the heat map shows other locations worldwide where #museumselfie was used, but with lower counts.


Figure 4 Heat map indicating the highest points of #museumselfie usage

Findings & conclusions

After identifying the museums most associated with the hashtag #museumselfie, we manually checked which artworks in these museums are the most popular to take a selfie with. Based on the data, this produced the following list:

1. The Museum of Modern Art (New York)
Fulang-Chang and I – Frida Kahlo

The portrait Fulang-Chang and I was made by Frida Kahlo and is exhibited at The Museum of Modern Art, New York. It is not the artwork itself that attracts visitors to take a selfie, but the mirror hanging right next to the portrait. The mirror is set in a frame that resembles the portrait’s frame, allowing visitors to create a visual experience in which they are the protagonist of their ‘own’ Frida Kahlo, right next to the original artwork.

2. Metropolitan Museum (New York)
Wall Drawing #370 – Sol LeWitt

In the Met Museum, most people take a selfie with Wall Drawing #370 by the artist Sol LeWitt. This huge artwork consists of ten blocks of black and white stripes in geometric shapes and covers the whole wall. Most people take a picture in front of one of the ten blocks (e.g. a triangle, cross, X or diamond), thereby cropping the artwork. The museum itself also posted a selfie of one of its new staff members in front of it and wrote in the caption that Wall Drawing #370 is the perfect museum selfie spot.

3. The Louvre Museum (Paris)
Mona Lisa – Leonardo da Vinci

The Mona Lisa, one of the most famous paintings by Leonardo da Vinci, is also the artwork in the Louvre that attracts the most selfies posted with #museumselfie. Compared with the other results, the Mona Lisa was the clear leader among the most popular artworks at the Louvre Museum and had the highest count of selfies taken with it.

4. The Museum of Islamic Art (Qatar)

Fancy jewels (different pieces)

In the Museum of Islamic Art, no single artwork stands out in selfie popularity. Inside the building, pieces of jewelry are photographed most often, and all of these pictures were made by women. Additionally, most of the selfies are taken in front of the museum building or in its surroundings.

Sidenote: The Museum of Islamic Art actively promotes #museumselfie on its Instagram page.

5. British Museum (London)
Hoa Hakananai'a a.k.a Dum Dum – Easter Island Statue

Although taking a museumselfie outside of the British Museum is very popular, people also take pictures with artworks inside of the museum. The Easter Island statue nicknamed Dum Dum is the most popular artwork to take a selfie with.

6. The Guggenheim Museum Bilbao (Bilbao)
Tulips – Jeff Koons

Jeff Koons’ Tulips is exhibited both outside and inside the Guggenheim Museum Bilbao. This large, colourful, shiny work of art is the most popular one to take a selfie with inside the museum. These selfies are not taken with an outstretched arm, but as a reflection in the artwork itself.

7. Los Angeles County Museum of Art (Los Angeles)
Faces of America (exhibition)

LACMA built a special installation around a mirror in which people could take a selfie. It was part of the exhibition Faces of America. Many people took this opportunity to snap a selfie and share it on Instagram with #museumselfie.

8. State Hermitage Museum (St. Petersburg)
Madonna Litta - Leonardo da Vinci

The data on the State Hermitage Museum did not show consistent use of any particular artwork for the #museumselfie. However, Madonna Litta has two mentions out of 38 and is therefore considered the artwork most used for #museumselfie posts.

9. Whitney Museum (New York)
Easyfun (exhibition) – Jeff Koons

Jeff Koons is also popular at the Whitney Museum. Most of the selfies are made as reflections in one of his artworks from the exhibition Easyfun, which consists of big, reflective, coloured organic shapes hanging opposite to each other.

In conclusion, based on the results presented above, a couple of statements can be made:

1. Overall, modern museums and their artworks are more popular for taking selfies and posting them on Instagram with #museumselfie and/or #selfie. The Mona Lisa is an exception, since it was one of the most shared artworks in posts mentioning #museumselfie and #selfie. Playful, reflective or mirroring artworks seem to attract visitors to take a selfie and use #museumselfie in their Instagram post. We can therefore state that there seems to be a correlation between interactive art, which encourages the viewer to interact with it, and the likelihood that people make a museum selfie with it. Jeff Koons' artworks, with their shiny, reflective surfaces, work best at encouraging visitors to take a selfie.

2. Instagram is a powerful tool for museums to address a wide and young public and to create an 'interplay' between visitors and artworks. This interplay consists of a process of appropriation and reinterpretation, in which artworks take other forms through filters, cropping, captions and hashtags. This form of engagement through selfies can be further encouraged by supporting the #museumselfie campaign, promoting it in the museum, and showing openness towards selfie culture. In this way, a wide and young public can be addressed and attracted to visit museums.


  • Decker, J. (2015) Engagement and Access: Innovative Approaches for Museums. Rowman & Littlefield.
  • Kidd, J. (2011) "Enacting engagement online: framing social media use for the museum", Information Technology & People, Vol. 24 Iss: 1, 64 - 77
  • Proctor, N. (2010), Digital: Museum as Platform, Curator as Champion, in the Age of Social Media. Curator: The Museum Journal, 53: 35–43.
  • Tifentale, A., & Manovich, L. (2014). Selfiecity: Exploring Photography and Self-Fashioning in Social Media.
Topic revision: r21 - 26 Feb 2016, RichardRogers