Roll simultaneously a hundred 100-sided dice, and add the resulting values. The set of possible outcomes is, of course, $latex \{100, 101, \dotsc, 10000\}.$ Note that there is only one way to obtain the outcome 100—all dice showing a 1. There is also only one way to obtain the outcome 10,000—all dice showing 100. For the outcome 101, there are exactly 100 different ways to obtain it: all dice except one show a 1, and one die shows a 2. The follow-up question is:

Write a (fast) script that computes the different ways in which we can obtain each of the possible outcomes.

The obvious first strategy is to come up with a function of three variables $latex (v=\text{value}, n=\text{number of dice}, s=\text{sides per die})$ that expresses the number of different ways to obtain the value $latex v$ rolling $latex n$ $latex s-$sided dice. We may re-write this function iteratively in the following fashion:

$latex f(v,n,s)$ $latex = \displaystyle{\sum_{k=1}^s} f(v-k, n-1, s) = \displaystyle{\sum_{k_1=1}^s \sum_{k_2=1}^s} f\big( v-k_1-k_2, n-2,s)$
$latex = \displaystyle{\sum_{k_1=1}^s \sum_{k_2=1}^s \dotsb \sum_{k_{d-1}}^s} f(v-k_1-\dotsb-k_{d-1}, 1, s\big)$

with the convention that $latex f(v,1,s)=0$ provided $latex v>s$ or $latex v<1,$ and of course $latex f(v,1,s)=1$ for $latex v \in \{1,\dotsc,s\}.$

A simple code (in ruby) that takes advantage of this strategy could be written as follows:



def vwnd(value,dice,sides)
  if value <  1
    return 0
  elsif dice==1 && value > sides
    return 0
    for step in 1...sides+1
    return collect

for value in idice...idice+(idice*isides-idice)/2+1
    stuff = vwnd(value,idice,isides)
    print value, " and ", idice*isides-value+idice, " with ", isides, "-sided ",
idice, " dice: ", stuff, "\n"

But this is not the best way to attack this problem: this script chokes even with relative small choices of number of dice and sides per die. A different approach to the functional relationship above offers a better chance. The focus is on the second expression:

$latex f(v,n,s) =$ $latex \displaystyle{\sum_{k_1=1}^s \sum_{k_2=1}^s} f\big( v-k_1-k_2, n-2,s)$
$latex = $ $latex f(v-2,n-2,s) + f(v-3,n-2,s) + \dotsb + f(v-1-s,n-2,s)$
$latex +$ $latex f(v-3,n-2,s) + f(v-4,n-2,s) + \dotsb + f(v-2-s,n-2,s)$
$latex \vdots$
$latex +$ $latex f(v-s-1,n-2,s) + f(v-s-2,n-2,s) + \dotsb + f(v-2s,n-2,s)$

The term $latex f(v-2,n-2,s)$ appears exactly once; the term $latex f(v-3,n-2,s)$ appears exactly twice; the term $latex f(v-4,n-2,s)$ appears exactly three times. You start seeing now how this can work in our advantage, right? Let me make it more clear with a small example:

  • 1 dice with 6 faces each: There are six possible outcomes, and each of them can be obtained in a single way.

    $latex \begin{array}{r}\text{outcomes:}\\ \text{ways to obtain it:}\end{array} \begin{array} {|c|c|c|c|c|c|}\hline 1&2&3&4&5&6\\ \hline 1&1&1&1&1&1\\ \hline\end{array}$

  • 2 dice with 6 faces each: There are eleven possible outcomes, and each of them can be obtained in different ways:

    $latex \begin{array}{r}\text{outcomes:}\\ \text{ways to obtain it:}\end{array} \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|} \hline 2&3&4&5&6&7&8&9&10&11&12\\ \hline1&2&3&4&5&6&5&4&3&2&1\\ \hline\end{array}$

    A strategy to obtain this result from the previous is by collecting the previous array of "ways to obtain the outcomes"—in this case, the vector $latex \big(1,1,1,1,1,1\big)$—and creating the following matrix of size $latex (6+5) \times 6$ from it:

    $latex \left(\begin{array}{ccccccccccc} 1&1&1&1&1&1&0&0&0&0&0\\ 0&1&1&1&1&1&1&0&0&0&0\\ 0&0&1&1&1&1&1&1&0&0&0\\ 0&0&0&1&1&1&1&1&1&0&0\\ 0&0&0&0&1&1&1&1&1&1&0\\ 0&0&0&0&0&1&1&1&1&1&1\end{array}\right)$

    Add all the elements on each column, to obtain the desired result. Do you see now how to iterate this process?
  • 3 dice with 6 faces each: We start with the previous result—the array $latex \big( 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1\big)$—and add the elements of the columns of the matrix of size $latex (11+5) \times 6$ below:

    $latex \left( \begin{array}{cccccccccccccccc} 1&2&3&4&5&6&5&4&3&2&1&0&0&0&0&0\\ 0&1&2&3&4&5&6&5&4&3&2&1&0&0&0&0\\ 0&0&1&2&3&4&5&6&5&4&3&2&1&0&0&0\\ 0&0&0&1&2&3&4&5&6&5&4&3&2&1&0&0\\ 0&0&0&0&1&2&3&4&5&6&5&4&3&2&1&0\\ 0&0&0&0&0&1&2&3&4&5&6&5&4&3&2&1\\ \end{array}\right)$

    The result is, in this case, $latex \big( 1, 3, 6, 10, 15, 21, 25, 27, 27, 25, 21, 15, 10, 6, 3, 1\big).$

The method that we propose is then as follows:

To obtain a vector that indicates the possible ways to compute the sums of all outcomes of rolling $latex n$ $latex s-$sided dice by iteration of the above procedure. Start with $latex m=1$ die, until reaching $latex m=n$ dice. The initial vector has all ones and length $latex s$, and each successive $latex k-$th step of this algorithm will provide us with a new vector of length $latex s+(k-1)(s-1).$ For the last step $latex (k=n)$ we have, of course, $latex s+(n-1)(s-1) = ns-n+1$ possible outcomes.

A ruby script that uses this idea could look like this:


dice = ARGV[0].to_i
sides= ARGV[1].to_i {1}

def helper(pattern,target,step,size)
  if step==1
    return target
    pattern.each_index{ |x| target[x+1-step+size] += pattern[x] }
    return helper(pattern,target,step-1,size)

def count_permutations(dice,sides,ary)
  if dice==1
    return ary
    return puts count_permutations(dice-1,sides,helper(ary,ary,sides,sides))

puts count_permutations(dice,sides,original_permutation)



It is a great exercise to code this fun script in a language more apt to scientific computing: python, sage, matlab, C/C++, Ocaml, and of course look for tricks to simplify and speed it up. A compiled version of this algorithm should not take much time to obtain the desired result.  You can even go a little bit farther and code a new script that will compute the number of different ways to come up with the possible outcomes of rolling $latex n$ $latex s-$sided dice, where the faces of each die belong in a given set $latex \{F_1, F_2, \dotsc, F_s\}.$ The values $latex F_k$ are allowed to be equal, or non-consecutive… anything, really.

Helper: more than a mere helper

The small recursive function that computes the key step of the algorithm is also the key to other similar counting problems. We called it helper for lack of a better word, and you can use it to aid you in other related tasks. For instance, to compute all the binomial coefficients $latex \binom{k}{n}$ for any $latex n \in \mathbb{N}$ and any $latex 0 \leq k \leq n,$ you could issue:

first_bc = {1}
def binomial_coeffs(n,ary)
  if n==0
    return ary
    return binomial_coeffs(n-1,helper(ary,ary,2,2))

puts binomial_coeffs(n,first_bc)

Solution to our problem

The script presented above computes the required different ways to come up with the $latex 10000-100+1=9901$ possible outcomes of rolling a hundred 100-sided dice… in about a minute! (mileage may vary depending on your computer, of course)

$ time ./justin 100 100
1 100 5050 171700 4421275 91962520 1609344100 24370067800 325949656825 39113958
81900 42634215112710 426342151127100 3943664897925675 33976189889821200 2742363
89824985400 2084196562669889040 14980162794189827475 102217581419177646300 6644
14279224654700950 4126362365711013405900 24551856075980529765105 14029632043417
4455800600 771629762387959506903300
102217581419177646300 14980162794189827475 2084196562669889040 2742363898249854
00 33976189889821200 3943664897925675 426342151127100 42634215112710 3911395881
900 325949656825 24370067800 1609344100 91962520 4421275 171700 5050 100 1

real	1m16.462s
user	1m10.895s
sys	0m1.863s

Fast scripts for computing multiple dice results

While looking for ideas to implement voronoi in sage, we stumbled upon a beautiful paper written by a group of japanese computer graphic professionals from the universities of Hokkaido and Tokyo: A Method for Creating Mosaic Images Using Voronoi Diagrams. The first step of their algorithm is simple yet brilliant: Start with any given image, and superimpose an hexagonal tiling of the plane. By a clever approximation scheme, modify the tiling to become a voronoi diagram that adaptively minimizes some approximation error. As a consequence, the resulting voronoi diagram is somehow adapted to the desired contours of the original image.
(Fig. 1) (Fig. 2) (Fig. 3) (Fig. 4)
In a second step, they manually adjust the Voronoi image interactively by moving, adding, or deleting sites. They also take the liberty of adding visual effects by hand: emphasizing the outlines and color variations in each Voronoi region, so they look like actual pieces of stained glass (Fig. 4).
The following is a simple script in sage to initialize the mosaic. It loads the original image into the array img (Fig. 1), and creates the hexagonal voronoi image newimg (Fig. 2). Each cell on this image has a unique color, and this is obviously chosen so to minimize the $latex L_2$ error by constants:
from numpy import *
import scipy.ndimage
import scipy.misc

# The parameter *scale* modifies the size of the initial hexagonal cells.

# Creation of the initial hexagonal tiling.
# Centers are stored in *lattice*, cells in *voronoi*

for k in range(floor(length/scale/sqrt(3)-0.5)+1):
    for j in range(floor(width/scale/3-0.5)+1):

for k in range(floor(length/scale/sqrt(3)+1)):
    for j in range(floor(width/scale/3+1)):

# Resulting image is stored in the array *newimg*

for k in range(1,voronoi.max()+1):
    rmean = sum(multiply(voronoi==k,rimg))/base
    gmean = sum(multiply(voronoi==k,gimg))/base
    bmean = sum(multiply(voronoi==k,bimg))/base
In the second step, each cell of the voronoi diagram is allowed to change its center to the eight immediate neighborhood pixels. Among the nine possible cells arising (counting the original as well), we keep the one that offers the smallest error. We repeat this second step as many times as necessary, updating the diagram each time accordingly (see Fig. 3). A na├»ve script to realize this task in sage could go like this—note that the algorithm is designed with pedagogical purposes in mind, instead of seeking an optimal code that employes the usual trickery to gain speed and minimize memory allocation:
for k in range(len(lattice)):
    center  = lattice[k]
    xcenter = center[0]
    ycenter = center[1]
    ecenter = center[2]
    errors = zeros((3,3))
    for xoffset in range(-1,2):
        for yoffset in range(-1,2):
            if testxcenter>-1
                     and testxcenter < length
                     and testxcenter > -1
                     and testycenter < width:
                testlattice[k] = [xcenter+xoffset,ycenter+yoffset,ecenter]

                for location in testlattice:
                    cell = testvoronoi[xcenter,ycenter]
                    error = 0.0
                    for step in range(img.shape[2]):
                        errors[xoffset+1,yoffset+1] = sqrt(error)
    minerror = errors.min()
    minerrorposition = errors.argmin()
    xshift = minerrorposition//3 -1
    yshift = minerrorposition%3 -1
    newxcenter = xcenter+xshift
    newycenter = ycenter+yshift
    if newxcenter > -1
            and newxcenter < length
            and newycenter > -1
            and newycenter < width
            and minerror < ecenter:
It is very desirable to design a third step that will actually accomplish the task of imposing the real contours of the image, and enhances them accordingly. In a future post we will present some ideas in this direction, based on two techniques: "constrained Voronoi diagrams", and "optimal subdivision of cells by linear regression."

Voronoi mosaics

Recall the First Spherical Law of Cosines:

Given a unit sphere, a spherical triangle on the surface of the sphere is defined by the great circles connecting three points $latex u$, $latex v$, and $latex w$ on the sphere. If the lengths of these three sides are $latex a$ (from $latex u$ to $latex v),$ $latex b$ (from $latex u$ to $latex w),$ and $latex c$ (from $latex v$ to $latex w),$ and the angle of the corner opposite $latex c$ is $latex C,$ then $$\cos c = \cos a \cos b + \sin a \sin b \cos C$$

In any decent device and for most computer languages, this formula should give well-conditioned results down to distances as small as around three feet, and thus can be used to compute an accurate geodetic distance between two given points in the surface of the Earth (well, ok, assuming the Earth is a perfect sphere). The geodetic form of the law of cosines is rearranged from the canonical one so that the latitude can be used directly, rather than the colatitude, and reads as follows: Given points $latex p_1$ and $latex p_2$ with positions $latex (lat_1, long_1)$ and $latex (lat_2, long_2)$ respectively, the distance $latex d$ between the two points is given by the following formula.

$$ \cos\displaystyle{\frac{d}{R_\oplus}}=\sin(lat_1)\sin(lat_2) + \cos(lat_1)\cos(lat_2)\cos(long_2-long_1),$$

where $latex R_\oplus=3,959$ is the radius of the Earth in miles (well, ok, the average radius of the Earth…)

A nice application of this formula can be used for geolocation purposes, and we recently had the pleasure to assist Thumb Mobile to write such functionality for one of their clients.

Go to in your mobile device, and click on "Find a Location." This fires up the location services of your browser. When you accept, your latitude and longitude are tracked. After a fast, reliable and resource-efficient algorithm, the page offers the location of the restaurant from the Lizard's chain that is closest to you. Simple, right?


In this post, we would like to show how to use  a few different features of numpy, scipy and matplotlibs to accomplish a few basic image processing tasks: some trivial image manipulation, segmentation, obtaining of structural information, etc.  An excellent way to show a good set of these techniques is by working through a complex project.  In this case, we have chosen the following:

Given a HAADF-STEM micrograph of a bronze-type Niobium Tungsten oxide $latex Nb_4W_{13}O_{47}$ (left), find a script that constructs a good approximation to its structural model (right).
Courtesy of ETH Zurich

For pedagogical purposes, we took the following approach to solving this problem:

  • Segmentation of the atoms by thresholding and morphological operations.
  • Connected component labeling to extract each single atom for posterior examination.
  • Computation of the centers of mass of each label identified as an atom. This presents us with a lattice of points in the plane that shows a first insight in the structural model of the oxide.
  • Computation of Delaunay triangulation and Voronoi diagram of the previous lattice of points. The combination of information from these two graphs will lead us to a decent (approximation to the actual) structural model of our sample.

Let us proceed in this direction:

Needed libraries and packages

Once retrieved, our HAADF-STEM images will be stored as big matrices with float32 precission. We will be doing some heavy computations on them, so it makes sense to use the numpy and scipy libraries. From the first module we will retrieve all the functions; from the second, it is enough to retrieve the tools in the scipy.ndimage package (multi-dimensional image processing). The procedures that load those images into matrices, as well as some really good tools to manipulate plots and present them are in the matplotlib libraries; more concretely, matplotlib.image and matplotlib.pyplot. We will use them too. The preamble then looks like this:

from numpy import *
import scipy.ndimage
import matplotlib.image
import matplotlib.pyplot

Loading the image

The image is loaded with the command matplotlib.imread(filename). It stores the image as a numpy.ndarray with dtype=float32. Notice that the maxima and minima are respectively 1.0 and 0.0. Other interesting information about the image can be retrieved:

print “Image dtype: %s”%(img.dtype)
print “Image size: %6d”%(img.size)
print “Image shape: %3dx%3d”%(img.shape[0],img.shape[1])
print “Max value %1.2f at pixel %6d”%(img.max(),img.argmax())
print “Min value %1.2f at pixel %6d”%(img.min(),img.argmin())
print “Variance: %1.5f\nStandard deviation: %1.5f”%(img.var(),img.std())

Thresholding is done, as in matlab, by imposing an inequality on the matrix holding the data. The output is a True-False matrix of ones and zeros, where a one (white) indicates that the inequality is fulfilled, and zero (black) otherwise.

img>0.2 img>0.7
multiply(img<0.5,img>0.25) multiply(img,img>0.62)

By visual inspection of several different thresholds, we chose 0.62 as one that gives me a good map showing what we need for segmentation. We need to get rid of "outliers", though: small particles that might fulfill the given threshold, but are small enough not to be considered an actual atom. Therefore, in the next step we perform a morphological operation of opening to get rid of those small particles: we decided than anything smaller than a quare of size $latex 2 \times 2$ is to be eliminated from the output of thresholding:

BWatoms=img > 0.62

And we are now ready for segmentation. We perform this task with the scipy.ndimage.label function: It collects one slice per segmented atom, and the number of slices computed. We need to indicate what kind of connectivity is allowed. For example, in the toy example below, do we consider this as one or two atoms?

It all depends: We'd rather have it now as two different connected components, but for some other problems we might consider that they are just one. The way we code it, is by imposing a structuring element that defines feature connections. For example, if our criteria for connectivity between two pixels is that they are in adjacent edges, then the structuring element looks like the image in the left below. If our criteria for connectivity between two pixels is that they are also allowed to share a corner, then the structuring element looks like the image in the right below. For each pixel we impose the chosen structuring element, and count the intersection: if there are no intersections, then the two pixels are not connected. Otherwise, they belong to the same connected component.

We want to make sure that atoms that are too close in a diagonal direction are counted as two, rather than one, so we chose the structuring element on the left. The script reads then as follows:


Generation of the lattice of the oxide

The object segmentation contains a list of slices, each of them with a True-False matrix containing each of the found atoms of the oxide. We can obtain for each slice a great deal of useful information. For example, the coordinates of the centers of mass of each atom can be retrieved by

xcoords=array([x[1] for x in coords])
ycoords=array([x[0] for x in coords])

Note that, because of the way matrices are stored in memory, there is a trasposition of the $latex x$ and $latex y-$coordinates of the locations of the pixels. No big deal, but we need to take it into account.

Other useful bits of information available:

#Calculate the minima and maxima of the values of an array
#at labels, along with their positions.
# Calculate the mean of the values of an array at labels.

Notice the overlap of the computed lattice of points over the original image (below, left): we have successfully found the centers of mass of most atoms, although there are still about a dozen regions where we are not too satisfied with the result. It is time to fine-tune by the simple method of changing the values of some variables: play with the threshold, with the structuring element, with different morphological operations, etc. We can even add all the obtained information for a wide range of those variables, and filter out outliers. An example with optimized segmentation is shown below (right):

Structural Model of the sample

For the purposes of this post, we will be happy providing the Voronoi diagram and Delaunay triangulation of the previous lattice of points. With matplotlib we can take care of the latter with a simple command:


To show the results, we overlaid the original image (with a forced gray colormap) with the triangulation generated above (in blue).

The next step is the computation of the Voronoi diagram. Unfortunately, this is no easy task with sage yet. By interacting with the octave function voronoi this should be possible, but my attempts were not too fruitful. Instead, we generated a Voronoi map by brute force: the function scipy.ndimage.distance_transform_edt provides us with an exact euclidean distance transform for any segmentation. Once this transform is computed, we assigned to each pixel the value of the nearest segment.


Note that, since there are so many atoms (about 1600), the array stored in voronoi will have those many different colors. A raw visualization of that array will give us little insight on the structural model of the sample (below, left). Instead, we want to reduce the number of "colors", so that visualization is meaningful. We chose 64 different colors, and used an appropriate colormap for this task (below, center). Another possibility is, of course, to plot the edges of the Voronoi map (not to confuse it with the actual Voronoi diagram!): We can compute these with the simple command

edges=(edges > 0)

In the image below (right) we have included the lattice with the edges of the Voronoi diagram.

But the structural model will not be complete unless we include the information of the chambers of the sample, as well as that of the atoms. The reader will surely have no trouble modifying the steps above to accomplish that task. The combination of the information obtained by merging both Voronoi diagrams accordingly, will yield an accurate model for the sample at hand. More importantly, this procedure can be applied to any micrograph of similar appearance. Note that the quality and resolution of the acquired images will surely enforce different optimal values on thresholds, and fine-tuning of the other variables. But we can expect a decent result nonetheless. Even better outcomes are to be expected when we employ more serious techniques: the segmentation based on thresholding and morphology is weak compared to watershed techniques, or segmentations based on PDEs. Better edge-detection filters are also available.

All the computational tools are amazingly fast and, more importantly, freely available. This was simply unthinkable several years ago.

Generation of figures

One example will suffice. For instance, the commands used to save to png format the last image are given by:

matplotlib.pyplot.triplot(triang, ‘r.’, None)

Image Processing with numpy, scipy and matplotlibs in sage

Contact Us

Tizona Scientific Solutions LLC

230 Corley Woods Drive
Lexington, SC 29072

voice: (803) 386-1822

Sample Client Base


About Us

Tizona is a scientific consulting company dedicated to solving challenging problems through the development and application of advanced mathematical methods.

Tizona's employees are experts in their technical fields who have made the commitment to solve real-world problems. They excel in approximation theory, image processing, physical modeling, and software development. At Tizona, we believe in combining talent with willingness to work on challenging problems and to extend knowledge to other disciplines. This philosophy allows us to create effective approaches to problem solving and product development that has been highly successful.