Click on a heading below to show only that medium. Percentages are a 10-year rolling average.

Sources and Methods

Data for this project is from the Williams College Museum of Art and is available here. It has 541 entries and includes all sculptures in the collection.

I used the Python library pandas to clean and prepare the data for this project. First I loaded the spreadsheet:


                import pandas as pd
                import re

                df = pd.read_csv("wcma-collection-SCULPTURE.csv")

Before making the final chart, I wanted a sense of which mediums were most common in the dataset. But the medium column was inconsistently formatted, so I had to do some cleanup first. When a sculpture used multiple mediums, they were sometimes separated with |, sometimes with a comma, and sometimes just with spaces. The following code is my best attempt at normalizing that. If you aren't familiar with Python or regex, what I did is split each string on | and , characters and on "and", then strip surrounding spaces and question marks from each piece. I stored the result in a new mediums column and then counted the values to see which were most common. It's not perfect (mediums separated only by spaces aren't split, for example), but it gave me a sense of what was going on.


                from collections import Counter
                from itertools import chain

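                def splitter(string):
                    # missing values (NaN) are not strings; treat them as "no mediums"
                    if not isinstance(string, str):
                        return []

                    # split on "|", ",", or the standalone word "and";
                    # the word boundaries keep words like "sandstone" intact
                    pieces = re.split(r"[|,]|\band\b", string)

                    # trim spaces and question marks, then drop empty pieces
                    cleaned = (s.strip().strip("?") for s in pieces)
                    return [s for s in cleaned if s]

                df["mediums"] = df.medium.map(splitter)

                # flatten the per-row lists and count each medium
                mediums = Counter(chain.from_iterable(df.mediums))
                mediums.most_common(10)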

Running it yielded this:


                [('bronze', 98),
                ('wood', 48),
                ('mixed media', 27),
                ('metal', 20),
                ('steel', 17),
                ('glass', 16),
                ('paint', 14),
                ('paper', 12),
                ('limestone', 12),
                ('wire', 12)]
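
To make the splitting step concrete, here is a minimal sketch run on a few made-up medium strings. The sample strings and the split_mediums name are assumptions for illustration, not values from the WCMA spreadsheet, and the pattern treats "and" as a separator only when it is a standalone word.

```python
import re

def split_mediums(text):
    """Split one raw medium string into a list of cleaned medium names."""
    if not isinstance(text, str):
        return []  # missing values (None/NaN) become an empty list
    # split on "|", ",", or the standalone word "and"
    parts = re.split(r"[|,]|\band\b", text)
    # trim spaces and question marks, then drop empty leftovers
    cleaned = [p.strip(" ?") for p in parts]
    return [p for p in cleaned if p]

# hypothetical examples, invented for this sketch
samples = ["bronze | wood", "steel, glass and paint", "sandstone?", None]
for s in samples:
    print(repr(s), "->", split_mediums(s))
```

Strings whose mediums are separated only by spaces would still come back as a single piece here, which is one reason the counts above are approximate.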

Lots of bronze and wood, which isn't too surprising. I decided to chart bronze, wood, steel, glass, and paper for no particular reason other than that they're all quite different from one another.

To get the values I actually used to make the chart, I ran the following code. It's pretty clunky and there's probably a slicker way to do this, but the dataset was small so I didn't worry too much.


                window_size = 10

                mediums = ["Bronze", "Wood", "Steel", "Glass", "Paper", "Metal"]
                
                data = []
                
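                # int() because max() can come back as a float when the column has
                # missing values; + 1 so the final year is included
                for y in range(1900, int(df.creation_date_latest.max()) + 1):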
                    row = [y]
                
                    # all the artwork whose range of possible creation
                    # dates overlaps the moving date window
                    in_year_window = df[
                        (df.creation_date_earliest <= y)
                        & (df.creation_date_latest > y - window_size)
                    ]
                
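                    n = len(in_year_window)

                    for medium in mediums:
                        # plain substring match (so "wood" also matches "plywood");
                        # rows with no medium listed count as non-matches
                        is_of_medium = in_year_window.medium.str.contains(
                            medium.lower(), regex=False, na=False
                        )

                        # filter the artwork further to include only sculptures made
                        # with the specific medium
                        m = len(in_year_window[is_of_medium])

                        # percentage of works in the window; None if the window is empty
                        row.append(m / n * 100 if n else None)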
                    
                    data.append(row)
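
For illustration, here is a minimal sketch of the same window logic applied to a tiny toy table, so the percentages can be checked by hand. The rows and the medium_share helper are invented for this example and are not part of the original script; only the column names match the spreadsheet columns used above.

```python
import pandas as pd

# toy data, invented for illustration (not from the WCMA spreadsheet)
toy = pd.DataFrame({
    "medium": ["bronze", "bronze | wood", "wood", "steel", None],
    "creation_date_earliest": [1900, 1905, 1951, 1955, 1958],
    "creation_date_latest":   [1902, 1910, 1951, 1960, 1958],
})

def medium_share(df, year, medium, window_size=10):
    """Percent of works in the window (year - window_size, year] that mention `medium`."""
    in_window = df[
        (df.creation_date_earliest <= year)
        & (df.creation_date_latest > year - window_size)
    ]
    if len(in_window) == 0:
        return float("nan")  # no works in this window, so no percentage
    # case-insensitive substring match; missing mediums count as non-matches
    hits = in_window.medium.str.contains(medium, case=False, regex=False, na=False)
    return hits.sum() / len(in_window) * 100

print(medium_share(toy, 1910, "bronze"))  # both works in the window mention bronze
print(medium_share(toy, 1960, "wood"))    # one of the three works in the window
print(medium_share(toy, 1930, "wood"))    # empty window
```

Returning NaN for an empty window is a design choice in this sketch: it leaves a gap in the chart for that year instead of raising a division-by-zero error.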

Then I put the data into Flourish and made the chart.
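
The write-up does not say how the rows were moved into the charting tool. One possible route, assuming the tool accepts a CSV upload, is to turn the list of rows into a DataFrame and export it. The values, column labels, and filename below are placeholders for this sketch.

```python
import pandas as pd

# stand-in for the `data` list built above: one row per year,
# first the year, then one percentage per medium (values invented)
mediums = ["Bronze", "Wood"]
data = [
    [1900, 50.0, 25.0],
    [1901, 40.0, 30.0],
]

# label the columns so the chart can use them as series names
chart_df = pd.DataFrame(data, columns=["Year"] + mediums)

# with no path given, to_csv returns the CSV text as a string
csv_text = chart_df.to_csv(index=False)
print(csv_text)

# to write a file instead (the filename is a placeholder):
# chart_df.to_csv("mediums-by-year.csv", index=False)
```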

I made this website using an HTML and CSS framework called Bootstrap. The code highlighting is done with a library called prism.js.

Conclusions

The most obvious pattern in the data is that bronze statues dominate the collection for the first half of the 20th century before dropping off and being replaced by a recent surge in the popularity of wood. I don't know if this is because bronze statue-making has become less popular or merely because only bronze statues from that era have survived. Bronze is a much more durable material than wood, glass, or paper.

There are some other, less obvious patterns too. Steel surged in the mid-century before dropping off around 2000, and it has recently come back. Paper was most common around the turn of the century. The popularity of glass is very erratic, probably because there aren't enough glass pieces in the collection for a pattern to emerge. None of this would have been visible without these sorts of quantitative techniques.

I don't know why these patterns exist, or whether they're indicative of larger trends in the art world. I don't know enough about art history to speculate. Regardless, I'm happy with how everything turned out, and I hope someone with more knowledge of the subject can comment on why we see the patterns we do.