# How to create a word cloud in Python

Word clouds are a nice way to visually summarize what a text is about. This page shows how to create them programmatically in Python.

### Steps

To create a rectangular word cloud, see the official documentation.

To create a masked word cloud programmatically, a Python script is required. Here is an example script:

```python
#!/usr/bin/env python
"""
Masked wordcloud
================

Using a mask you can generate wordclouds in arbitrary shapes.
"""

import os
from os import path

import numpy as np
from PIL import Image
from wordcloud import WordCloud, STOPWORDS

# get data directory (using getcwd() is needed to support running example in generated IPython notebook)
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# read the whole input text
with open(path.join(d, "input.txt"), encoding="utf-8") as f:
    text = f.read()

# read the mask image; pure white areas are kept free of words
mask = np.array(Image.open(path.join(d, "map.png")))

stopwords = set(STOPWORDS)

# rectangular word cloud
wc_rect = WordCloud(background_color="white", max_words=500, width=3000,
                    height=1500, stopwords=stopwords, min_font_size=2,
                    contour_width=3, contour_color='black')
wc_rect.generate(text)
wc_rect.to_file(path.join(d, "wc-rectangle.png"))

# masked word cloud; width and height are ignored when a mask is given
wc = WordCloud(background_color="white", max_words=1000, mask=mask,
               stopwords=stopwords, min_font_size=2,
               contour_width=3, contour_color='black')
wc.generate(text)
# the output file name is an assumption; choose any name you like
wc.to_file(path.join(d, "wc-masked.png"))
```


Let’s call this script make-masked-wordcloud.py.

A map is a black-and-white image that controls where the words are placed: words are drawn only on the non-white areas. In this example the map is this:

Let’s name this file “map.png” so that the above script works.
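If no suitable image is at hand, a map can also be generated in code. The following is only a minimal sketch, not part of the original script: the function name `make_ellipse_mask` and the sizes are made up for illustration. It builds a white canvas with a black ellipse, which matches the convention that white areas stay empty and all other areas may receive words.

```python
import numpy as np
from PIL import Image, ImageDraw


def make_ellipse_mask(width=800, height=400, margin=20):
    """Return a 2D uint8 array: 255 (white) = no words, 0 (black) = words allowed.

    Hypothetical helper for illustration; the sizes are arbitrary.
    """
    img = Image.new("L", (width, height), 255)  # start with an all-white canvas
    draw = ImageDraw.Draw(img)
    # paint the region where words may be placed in black
    draw.ellipse((margin, margin, width - margin, height - margin), fill=0)
    return np.array(img)


mask = make_ellipse_mask()
print(mask.shape, mask.dtype)
```

The resulting array can be passed as the `mask` argument of `WordCloud` directly, or written to a PNG with `Image.fromarray(mask).save(...)` and loaded like any other map.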

For the input text I’ll just copy the text from this article into a .txt file.

With “input.txt” and “map.png” in place, we can run make-masked-wordcloud.py like this:

?> python make-masked-wordcloud.py


This results in two word clouds: a rectangular one and a masked one.

The word “method” and the abbreviation “et al.” could be removed from input.txt, but that is not essential for understanding how to use the module.
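Instead of editing input.txt by hand, unwanted words can be filtered out in code. With the wordcloud module, this would usually mean adding them to the `stopwords` set that is passed to `WordCloud`. The sketch below shows the same idea with the standard library only, so it can be tried without any input files; the helper `word_frequencies`, the sample sentence and the stopword list are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical extra words to ignore, taken from the remark above.
EXTRA_STOPWORDS = {"method", "et", "al"}


def word_frequencies(text, stopwords):
    """Count lowercase words in text, skipping anything in stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


# Made-up sample sentence, not taken from the real input text.
sample = "The method of Smith et al. extends the method of Jones."
freqs = word_frequencies(sample, EXTRA_STOPWORDS | {"the", "of"})
print(freqs.most_common())
```

A frequency mapping like this could also be handed to `WordCloud.generate_from_frequencies` in place of calling `generate` on the raw text.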

Everything used to generate the wordcloud is available in word-cloud.tgz (requires access to the SFB’s Confluence).