Creating a PowerPoint from Excel.

Recently for work I wanted to present information from an Excel file in PowerPoint. The data in the Excel file is edited on an ongoing basis, and I wanted an easy way to keep the PowerPoint presentation synced with what is in Excel without copying and pasting or editing in two places.

For this example, I’m using a spreadsheet of types of trees. Thanks to the former City of Ottawa Forests and Greenspace Advisory Committee for compiling this data.

Excel spreadsheets can be read by Python's pandas library into a dataframe:

import pandas as pd
spreadsheet_path = "/content/trees.xls"
trees_df = pd.read_excel(open(spreadsheet_path , 'rb'),  header=0) 

I wanted to take the dataframe and make a set of slides from it. The python-pptx library creates PowerPoint files quite nicely. I installed it with: !pip install python-pptx

I created a class to wrap the python-pptx calls. The class handles creating slides with tables, as shown below. The full notebook is on GitHub.

from pptx import Presentation  
from pptx.util import Inches, Pt
import pandas as pd

class Ppt_presentation:

    def __init__(self):
        # Create the underlying python-pptx Presentation object
        self.ppt_presentation = Presentation()
    
    def get_ppt_presentation(self):
        return self.ppt_presentation
    
    # Adds one slide with text on it
    def add_slide_text(self, title_text, body_text):
        # Add a slide using the title and content layout
        slide = self.ppt_presentation.slides.add_slide(self.ppt_presentation.slide_layouts[1])
        slide.shapes.title.text = title_text
        slide.shapes.title.text_frame.paragraphs[0].font.size = Pt(32)
        shapes = slide.shapes
        body_shape = shapes.placeholders[1]
        tf = body_shape.text_frame
        tf.text = body_text

    # Adds one slide with a table on it.  The content of the table is a Pandas dataframe
    def add_slide_table_df(self, df, title_text, col_widths):
        # Add a slide using the title-only layout
        slide = self.ppt_presentation.slides.add_slide(self.ppt_presentation.slide_layouts[5])
        slide.shapes.title.text = title_text
        slide.shapes.title.text_frame.paragraphs[0].font.size = Pt(32)
        # Position and size of the table on the slide
        x, y, cx, cy = Inches(.5), Inches(1.5), Inches(8.5), Inches(.5)  
        df_rows = df.shape[0]
        df_cols = df.shape[1]
        
        # Add the table: one header row plus one row per dataframe row
        table = slide.shapes.add_table(df_rows+1, df_cols, x, y, cx, cy).table
        ccol = table.columns

        # Header row: column names from the dataframe; also set the column widths
        for c in range(0, df_cols):
            table.cell(0, c).text = df.columns.values[c]
            ccol[c].width = Inches(col_widths[c])
 
        for r in range(0,df_rows):
            for c in range(0,df_cols):
                table.cell(r+1, c).text = str(df.iat[r,c])
                for p in range(0,len(table.cell(r+1, c).text_frame.paragraphs)):
                    table.cell(r+1, c).text_frame.paragraphs[p].font.size = Pt(12)

    # Adds a series of slides with tables.  The content of the tables is a Pandas dataframe.
    # Each slide shows at most 'rows' rows; add_slide_table_df adds each slide.
    def add_slides_table_df(self, df, rows, title_text, col_widths):
        df_rows = df.shape[0]
        # Step through the dataframe 'rows' rows at a time
        for df_rows_cn in range(0, df_rows, rows):
            rows_df_end = min(df_rows_cn + rows, df_rows)
            rows_df = df.iloc[df_rows_cn:rows_df_end, :]
            self.add_slide_table_df(rows_df, title_text, col_widths)
        return
    
    def save(self,filename):
        self.ppt_presentation.save(filename)

Below, a title slide is created.

# import Presentation class 
# from pptx library 
from pptx import Presentation  
from pptx.util import Inches, Pt
import pandas as pd

ppres = Ppt_presentation()
ppres.add_slide_text("Salt Tolerance of Trees","November, 2020")
ppres.save("presentation.pptx")
print("done")

Next, I would like a set of slides with tables showing the common name of each type of tree, its botanical name and its salt tolerance. The data is read from the Excel .xls file into a dataframe.

trees_df = pd.read_excel(open(spreadsheet_path, 'rb'), header=0)

The rows and columns to be presented in the table are selected from the dataframe:

# Specify the rows and columns from the spreadsheet
cols_df = trees_df.iloc[0:132,[1,3,16]]

The column widths of the table in Powerpoint are set:

col_widths = [1.5,3.5,3.5]

The add_slides_table_df method of Ppt_presentation class is called:

ppres.add_slides_table_df(cols_df, 15, "Trees: Common name, Latin Name, Salt Tolerance.", col_widths)

# import Presentation class 
# from pptx library 
from pptx import Presentation  
from pptx.util import Inches, Pt
import pandas as pd

ppres = Ppt_presentation()
ppres.add_slide_text("Salt Tolerance of Trees","November, 2020")

spreadsheet_path = "/content/trees.xls"

trees_df = pd.read_excel(open(spreadsheet_path , 'rb'),  header=0) 
# We have some missing values. These need to be fixed, but for today's purposes, replace them with -
trees_df = trees_df.fillna("-")

# Specify the rows and columns from the spreadsheet
cols_df = trees_df.iloc[0:132,[1,3,16]]

# Add slides with tables of 15 rows each from the dataframe
# Specify the column widths of the table in inches
col_widths = [1.5,3.5,3.5]
ppres.add_slides_table_df(cols_df, 15, "Trees: Common name, Latin Name, Salt Tolerance.",col_widths)

ppres.save("presentation.pptx")
print("done")

Slides grouping trees by their salt tolerance are useful when considering trees for a particular site. The dataframe is sorted and grouped as below:

# Group results in dataframe by unique value
# Sort values for second column
salt_tolerance_df = trees_df.sort_values(['SaltToleranceEn','NameBotanical'])
salt_tolerance_df = salt_tolerance_df.groupby(['SaltToleranceEn'])['NameBotanical'].apply(', '.join).reset_index()

# import Presentation class 
# from pptx library 
from pptx import Presentation  
from pptx.util import Inches, Pt
import pandas as pd

ppres = Ppt_presentation()
ppres.add_slide_text("Salt Tolerance of Trees","November, 2020")

spreadsheet_path = "/content/trees.xls"

trees_df = pd.read_excel(open(spreadsheet_path , 'rb'),  header=0) 
# We have some missing values. These need to be fixed, but for today's purposes, replace them with -
trees_df = trees_df.fillna("-")

# Specify the rows and columns from the spreadsheet
cols_df = trees_df.iloc[0:132,[1,3,16]]

# Group results in dataframe by unique value
# Sort values for second column
salt_tolerance_df = trees_df.sort_values(['SaltToleranceEn','NameBotanical'])
salt_tolerance_df = salt_tolerance_df.groupby(['SaltToleranceEn'])['NameBotanical'].apply(', '.join).reset_index()

#Add slides with tables of 2 rows from the dataframe
col_widths = [1.5,7]
ppres.add_slides_table_df(salt_tolerance_df, 2, "Salt Tolerance",col_widths)

ppres.save("presentation.pptx")
print("done")

These slides are simple and need more formatting, but that can be done with Python-pptx too.

LiDAR LAS files

LiDAR captures details of the Earth's surface, and LAS files store these data as three-dimensional points with x, y and z coordinates. I want to use these data to model some of the objects LiDAR detects.

I'm using Python's laspy library, documented here and installed as shown below.

!pip install laspy

Laspy works with LAS files. The file I'm working with is from Pennsylvania, and this type of data is available for a wide range of geographies. Here's a list of LiDAR sources for Canadian provinces.

Open the file:

import numpy as np
from laspy.file import File
inFile = File("/content/drive/My Drive/MaskCNNhearths/code_and_working_data/test_las/24002550PAN.las", mode = "r")

From the laspy tutorial, here is a list of fields in the LAS file.

# Code from the laspy tutorial https://pythonhosted.org/laspy/tut_part_1.html
# Find out what the point format looks like.
pointformat = inFile.point_format
for spec in inFile.point_format:
    print(spec.name)

print("------------")

header = inFile.header

print(header.file_signature)
print(header.file_source_id)
print(header.global_encoding)
print(header.guid)
print(header.max)
print(header.offset)
print(header.project_id)
print(header.scale)
print(header.schema)

print("------------")

# Let's take a look at the header also.
headerformat = inFile.header.header_format
header = inFile.header
for spec in headerformat:
    print(spec.name)

Initially when I looked at plotting data in LAS files, I saw results that didn’t resemble the landscape, like below:

Each point has a raw_classification field (see the histogram below). Filtering for points where raw_classification = 2, which is ground in the LAS classification scheme, provides a closer representation of the surface.

%matplotlib inline
import matplotlib.pyplot as plt
plt.hist(inFile.raw_classification)
plt.title("Histogram of raw_classification")

Below, the points in the LAS file are reduced to those with raw_classification = 2 that fall within the rectangle defined by the coordinates (2553012.1, 231104.9) and (2553103.6, 231017.5).


#https://jakevdp.github.io/PythonDataScienceHandbook/04.12-three-dimensional-plotting.html
from mpl_toolkits import mplot3d
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Data for three-dimensional scattered points
ax = plt.axes(projection='3d')
coords = np.vstack((inFile.x, inFile.y, inFile.z, inFile.raw_classification)).transpose()

#2553012.1,231104.9
#2553103.6,231017.5

#2556834.3,230831.6
#2556985.1,230582.2
x1,y1 = 2553012.1,231104.9
x2,y2 = 2553103.6,231017.5

# keep only points with raw_classification = 2 (ground)
coords = coords[coords[:,3] == 2]

# keep only points inside the rectangle defined by (x1, y1) and (x2, y2)
coords = coords[coords[:,0] > x1]
coords = coords[coords[:,0] < x2]
coords = coords[coords[:,1] > y2]
coords = coords[coords[:,1] < y1]

print("We're keeping %i points out of %i total"%(len(coords), len(inFile)))

zdata = coords[:,2]
xdata = coords[:,0]
ydata = coords[:,1]
ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens');
fig = plt.figure(figsize=(20,10))
ax = plt.axes(projection='3d')

ax.plot_trisurf(xdata, ydata, zdata,
                cmap='viridis', edgecolor='none');
LAS file data plotted for a rectangle.

A TRS-80 Color Computer Photo Filter with 4 colors.

Following the previous post, this is a photo filter using the TRS-80 Color Computer’s higher resolution graphics.

Resolution: PMODE 1,1 has 128 x 96 pixels on the screen. I have used a grayscale photograph resized to 128 x 96.

Colors: PMODE 1 has two sets of four colors: [green, yellow, blue and red] and [buff, cyan, magenta and orange].

This program loops through each pixel in the grayscale photograph and converts it to a value representing one of the available four colors, depending on how dark the original pixel is. I am using yellow, green, red and blue to represent light to dark.

In PMODE 1, graphics are stored as bytes, each holding the values of four pixels in a horizontal row. Two bits per pixel represent its color:
00b or 0 is green. [or buff]
01b or 1 is yellow. [or cyan]
10b or 2 is blue. [or magenta]
11b or 3 is red. [or orange]

00011011 is a byte representing a green pixel, yellow pixel, blue pixel and red pixel.
00000000 4 green pixels
11111111 4 red pixels
01010101 4 yellow pixels
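
To make the packing concrete, here is a small Python sketch (not part of the Color Computer program below) that packs four 2-bit pixel values into one byte:

def pack_pixels(p0, p1, p2, p3):
    # p0 is the leftmost pixel; each value is 0-3 (green, yellow, blue or red)
    # and occupies two bits, with the leftmost pixel in the most significant bits
    return (p0 << 6) | (p1 << 4) | (p2 << 2) | p3

print(bin(pack_pixels(0, 1, 2, 3)))   # 0b11011 (27): green, yellow, blue, red
print(pack_pixels(1, 1, 1, 1))        # 85 (01010101b): four yellow pixels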

What is a little different from the previous program is that POKE is used to store the byte values into the TRS-80's video memory. Storing the byte values in DATA statements rather than individual POKE statements made the program smaller and faster to load and run. Below is the Python to generate the program. Here is the Color Computer program I load into XRoar Online.

def pixel_to_bit(pixel_color):
    # Map a grayscale value (0-255) to one of the four PMODE 1 colors,
    # darkest to lightest: blue, red, green, yellow
    if pixel_color < 48:
        # blue (10b)
        color_bits = 2
    elif pixel_color < 96:
        # red (11b)
        color_bits = 3
    elif pixel_color < 150:
        # green (00b)
        color_bits = 0
    else:
        # yellow (01b)
        color_bits = 1
    return color_bits

file = open('C:\\a_orgs\\carleton\\data5000\\xroar\\drawjeff_pmode1.asc','w') 
file.write("1 PMODE 1,1\r ")
file.write("2 SCREEN 1,0\r ")
for row in range(0,96,1):
    # One DATA statement per image row; give it a BASIC line number below 9930
    linenum = 10+row*96
    file.write(str(linenum)+" DATA ")
    for col in range(0,127,4):
        # PMODE 1 - Graphics are bytes that store values for 4 pixels in a horizontal row
        # 2 bits for each pixel represent its color:
        # 00b 0 green [or buff]
        # 01b 1 yellow [or cyan]
        # 10b 2 blue [or magenta]
        # 11b 3 red [or orange]
        # 00011011 is a byte with a green, a yellow, a blue and a red pixel
        # 00000000 4 green pixels
        # 11111111 4 red pixels
        # 01010101 4 yellow pixels
        byte_val=0
        for byte_col in range(0,4):
            # resized128x96 is the 128 x 96 grayscale photograph
            color_bits=pixel_to_bit(resized128x96[row,col+(3-byte_col)])
            byte_val=byte_val+(color_bits*(2**(byte_col*2)))
        file.write(str(byte_val))
        if(col<124):
            file.write(",")
    file.write("\r ")

file.write("9930 FOR DC=1 TO 3072\r ")
file.write("9935 READ BV\r ")
file.write("9940 POKE 1536+DC,BV\r ")
file.write("9950 NEXT DC\r ")
    
file.write("9999 GOTO 9999\r ")
file.close()
A four color filter.

An 80’s Themed Image Filter for Photographs.

This week in our seminar for HIST5709 Photography and Public History we discussed a reading from Nathan Jurgenson’s The Social Photo. He describes the use of image filters to give digital photographs an aesthetic that evokes nostalgia and authenticity. Below is a description of a retro image filter inspired by the Radio Shack Color Computer.

I was very fortunate that my parents bought this computer when we moved to Ottawa in 1983. I learned a lot using it and I used it a lot. As much as I loved it, I found its basic graphics mode odd. Coloured blocks were available, but instead of being square they were about 1.5 times as tall as they were wide. (See a list of the yellow blocks below.)

While I used these blocks for a few purposes, like printing large letters, their rectangular shape made them less satisfactory for drawing. Still, they are distinctive. I wanted to see if I could make an image filter with them and evoke a sense of the 1980’s.

I used the XRoar emulator as a virtual Color Computer rather than going fully retro and using the actual machine. See: https://www.6809.org.uk/xroar/. It takes a few steps to set up on a computer. There is an easier-to-run online version of the CoCo at: http://www.6809.org.uk/xroar/online/. To follow along, just set Machine: "Tandy CoCo (NTSC)" in the menu for XRoar Online.

Above: Color Computer running on XRoar Online.

To see one of these graphical blocks, type PRINT CHR$(129) and hit Enter in XRoar. (Note that the CoCo keyboard uses Shift+8 for "(" and Shift+9 for ")".) Try a few different values like PRINT CHR$(130) or 141 and you will see rectangular graphics like the yellow blocks in the screen above.

Using these to represent a photograph provides a maximum resolution of 64 pixels wide by 32 pixels tall. (The screen is 32 characters wide with 16 rows.) I wanted to leave a row for text, so I used a resolution of 64 x 30. However, since the pixels are 1.5 times taller than they are wide, I started from a photograph with an aspect ratio of 64 x 45 (45 = 30 * 1.5).

I used the picture below. It’s a screen grab my daughter took that has some contrast and could be used for my Twitter profile.

Raw image in grayscale. It’s 192X135 or 3 times larger than 64×45.

Here’s the Python code I used:

# import the necessary packages
# Credit to: Adrian Rosebrock https://www.pyimagesearch.com/
from imutils import paths
from matplotlib import pyplot
import argparse
import sys
import cv2
import os

import shutil
from pathlib import Path
img_local_folder = "C:\\xroar\\"
path = Path(img_local_folder)
os.chdir(path)
# 192X135 is used since it's a multiple of 64X45. 
img_file_name = "jeffb_192x135.jpg"
hash_image = cv2.imread(img_file_name)

# if the image is None then we could not load it from disk (so skip it)
if hash_image is not None:
    # show the original image, then convert it to grayscale
    pyplot.imshow(hash_image)    
    pyplot.show()

    hash_image = cv2.cvtColor(hash_image, cv2.COLOR_BGR2GRAY)
    pyplot.imshow(hash_image,cmap='gray')    
    pyplot.show()

    # resize the input image to 64 pixels wide and 30 high.
   
    resized = cv2.resize(hash_image, (64, 30))
    pyplot.imshow(resized,cmap='gray')
    pyplot.show()
    
else:
    print("no image.")
A flattened 64X30 image.

Let’s convert this to black and white.

#convert the grayscale to black and white using a threshold of 92
(thresh, blackAndWhiteImage) = cv2.threshold(resized, 92, 255, cv2.THRESH_BINARY)
print(blackAndWhiteImage)
pyplot.imshow(blackAndWhiteImage,cmap='gray')
pyplot.show()
A black and white version.

This image needs to be translated in order to import it into the CoCo. We will turn it into a BASIC program of PRINT statements. Here is a sample of this very simple and inefficient program.

 150 PRINT CHR$(128);
 152 PRINT CHR$(128);
 154 PRINT CHR$(143);
 156 PRINT CHR$(143);
 158 PRINT CHR$(143);
 160 PRINT CHR$(143);
 162 PRINT CHR$(131);
 164 PRINT CHR$(131);
 166 PRINT CHR$(131);
 168 PRINT CHR$(135);

This program is generated by Python, which loops through 2 x 2 squares of pixels in the black and white image. For each pixel with a value of 255 (white), the corresponding quadrant (top/bottom, left/right) of the rectangular graphics block is turned on.

file = open('C:\\xroar\\xroar-0.35.2-w64\\drawjeff.asc','w') 
for row in range(0,30,2):
    for col in range(0,64,2):        
        linenum = row*64+col
        # bit 1 - set from the lower-right pixel of the 2x2 block
        bit1=0
        # bit 2 - set from the lower-left pixel
        bit2=0
        # bit 4 - set from the upper-right pixel
        bit4=0
        # bit 8 - set from the upper-left pixel
        bit8=0
        # if a pixel is white (255), set its bit so that quadrant is lit (green); otherwise it stays black
        if(blackAndWhiteImage[row,col]==255):
            bit8=8
        if(blackAndWhiteImage[row,col+1]==255):
            bit4=4
        if(blackAndWhiteImage[row+1,col]==255):
            bit2=2
        if(blackAndWhiteImage[row+1,col+1]==255):
            bit1=1
        # CHR$(128) is an all-black block; adding the bits lights the quadrants
        chr_code = 128+bit1+bit2+bit4+bit8
        # write the statement into the program.
        file.write(str(linenum)+" PRINT CHR$("+str(chr_code)+");\r ")
    # write an end of line statement if line is less than 32 characters
    #file.write(str(linenum)+" PRINT CHR$(128)\r ")
file.close()

A sample of the generated program is here. To run it, in XRoar Online click Load and select the downloaded drawjeff.asc file. Then type CLOAD <enter> in the emulator. (See below.)

Loading will take a moment. Imagine popping a cassette into a tape recorder, typing CLOAD and pressing the play button. F DRAWJEFF will be displayed while the file is loaded.

This will appear during the loading of the file.

Once loaded, OK will appear. Type RUN.

A photograph image filter… Imagination required.

It’s neat that there is an on-line emulator for a computer from almost 4 decades ago. It’s also neat that Python can write programs that will run on it.

The book Getting Started with Extended Color BASIC is on the Internet Archive. I loved this book and think it’s an excellent introduction to programming. There are lots of ideas to try out on XRoar.

Matching images using Image Hashing

This is a brief post of my notes describing the process to match similar images in an archive of photographs. I am using the techniques described by Adrian Rosebrock in his excellent article Image hashing with OpenCV and Python. The images used are from the Pompeii Artistic Landscape Project and provided courtesy of Pompeii in Pictures.

Image hashing is a process to match images through the use of a number that represents a very simplified form of an image, like this one below.

Original image before image hashing. Images courtesy of Pompeii in Pictures.

First, the color of the image is simplified. The image is converted to grayscale. See below:

Image converted to grayscale. Images courtesy of Pompeii in Pictures.

Next, the image is simplified by size. It is resized to 9 pixels wide by 8 pixels high.

Image resized to 9 pixels wide by 8 pixels high. The green and red rectangles are relevant to describe the next step.

Adrian Rosebrock uses a difference hash based on brightness to create a 64-bit binary number. Each bit is 1 or 0. Two pixels next to each other horizontally are compared, left and right: if the right pixel is brighter, the bit is 1; otherwise it is 0. See below:

The result of the image hash for the image above. The 1 inside the green square is the result of comparing the two pixels in the green rectangle in the picture above. The same is true for the 0 inside the red square: inside the red rectangle two images above, the pixel on the left is brighter, so the result is 0.

This process produces a 64-bit binary number: 0101001100000011101001111000101110011101000011110000001001000011

This converts to decimal 5981808948155449923.
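
For reference, here is a minimal sketch of the difference hash described above, following the approach in Adrian Rosebrock's article (the image path is a placeholder):

import cv2

def dhash(image_path, hash_size=8):
    # Convert to grayscale and resize to (hash_size + 1) x hash_size pixels
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (hash_size + 1, hash_size))
    # Compare horizontally adjacent pixels: 1 where the right pixel is brighter
    diff = resized[:, 1:] > resized[:, :-1]
    # Pack the 64 bits into a single integer
    return sum(2 ** i for (i, v) in enumerate(diff.flatten()) if v)

# Hashes of two images can then be compared; identical or near-identical images
# produce identical or very close hashes.
# print(dhash("pompeii_photo.jpg"))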

Matches

A match of copies of an image.
An interesting match of similar images.

References

Dunn, Jackie and Bob Dunn. Pompeii in Pictures.

Rosebrock, Adrian. Image hashing with OpenCV and Python.

Building a Wall Construction Detection Model with Keras.

I am building a project to detect wall construction types from images of Pompeii. I am using Waleed Abdulla's Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. Also, I employ the technique Jason Brownlee describes in How to Train an Object Detection Model with Keras, where he detects kangaroos in images. Instead of kangaroos, I want to detect the type of construction used in building walls.

This is a brief post describing the preparation of images for training as well as the initial results. The images used are from the Pompeii Artistic Landscape Project and provided courtesy of Pompeii in Pictures. The original images were photographed by Buzz Ferebee and they have been altered by the program used for predictions. An example of an image showing the model’s detection of construction type opus incertum is below. Cinzia Presti created the data used to select the images for training.

The red rectangles note the model’s prediction of opus incertum as a wall construction type. Image courtesy of Pompeii in Pictures. Originally photographed by Buzz Ferebee.

To build this model, images were selected for training. Because the construction type is visible in only parts of each image, rectangles mark where it appears in each image.

Image showing areas designated for training the model to detect opus incertum. File name: 00096.jpg. Image courtesy of Pompeii in Pictures. Originally photographed by Buzz Ferebee.

Each image has a corresponding xml file containing the coordinates of the rectangles that enclose the objects to train on. See file 00096.xml below:

<annotation>
	<folder>opus_incertum</folder>
	<filename>00096.jpg</filename>
	<path>/home/student/data_5000_project/data/images/construction_types/raw/opus_incertum/pompeiiinpictures Ferebee 20600 May 2016 DSCN8319.JPG</path>
	<source>
		<database>pompeiiinpictures.com</database>
	</source>
	<size>
		<width>1024</width>
		<height>768</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>opus_incertum</name>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>16</xmin>
			<ymin>579</ymin>
			<xmax>257</xmax>
			<ymax>758</ymax>
		</bndbox>
	</object>
	<object>
		<name>opus_incertum</name>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>507</xmin>
			<ymin>563</ymin>
			<xmax>703</xmax>
			<ymax>749</ymax>
		</bndbox>
	</object>
	<object>
		<name>opus_incertum</name>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>829</xmin>
			<ymin>570</ymin>
			<xmax>1007</xmax>
			<ymax>752</ymax>
		</bndbox>
	</object>
</annotation>
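
As an aside (this is not part of the original training code), the bounding boxes in an annotation file like this can be read with Python's built-in ElementTree:

import xml.etree.ElementTree as ET

def load_boxes(annotation_path):
    # Return a list of (class name, (xmin, ymin, xmax, ymax)) tuples
    root = ET.parse(annotation_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        bndbox = obj.find("bndbox")
        box = tuple(int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, box))
    return boxes

# load_boxes("00096.xml") -> [('opus_incertum', (16, 579, 257, 758)), ...]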

The program that creates the xml annotation files also saves the images with a standard numeric file name (e.g. 00001.jpg) and a width of 1024 pixels.
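
A rough sketch of that resizing and renaming step (assuming OpenCV; the directory and naming details here are illustrative, not the original code):

import cv2
import os

def save_standardized(image_path, index, out_dir, target_width=1024):
    # Resize to a 1024 pixel width, keeping the aspect ratio, and save
    # under a zero-padded numeric file name such as 00001.jpg
    image = cv2.imread(image_path)
    scale = target_width / image.shape[1]
    resized = cv2.resize(image, (target_width, int(image.shape[0] * scale)))
    out_path = os.path.join(out_dir, "{:05d}.jpg".format(index))
    cv2.imwrite(out_path, resized)
    return out_path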

Initial Results

The “Actual” column of images below shows images used in training the model. The white rectangles show the bounding boxes contained in the corresponding xml file for each image. Some images don’t have a white rectangle: I judged that they didn’t contain a good enough sample for training, so I didn’t make an xml file for them.

The “Predicted” column shows what the model considers to be opus incertum construction. Frequently it’s correct. It makes errors too: for example, the blue sky in row 5 is recognized as stonework. I want to see if further training can correct this.

A couple of things to note: it’s bad practice to run a model on the images used to train it, but I am doing that here to verify that it is functioning. Later, I also need to see how the model performs on images with no opus incertum.

References

Abdulla, Waleed. Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. GitHub repository. Github, 2017. https://github.com/matterport/Mask_RCNN.

Brownlee, Jason. How to Train an Object Detection Model with Keras. Machine Learning Mastery. https://machinelearningmastery.com/how-to-train-an-object-detection-model-with-keras/.

Dunn, Jackie and Bob Dunn. Pompeii in Pictures.

Ferebee, Buzz. Pompeii Photographic Archive. 2015-2017.

Presti, Cinzia. Image Classification Workspace.

Using Box.com’s API to get images.

This is a note about how I connected to Box.com using its API so that a Python program could download images and metadata.

Details of the API are here: https://developer.box.com/en/guides/authentication/oauth2/with-sdk/

To connect to Box, I needed to make an app. See the link “My Apps” in the SDK link above.

Click My Apps

Create a new app.

Click Create New App

Give your app a name.


I used OAuth 2.0 Authentication

I used standard OAuth 2.0. You will need your Client ID and Client Secret later in this process. Protect this information and don’t put it directly in your code.
I used my website for a Redirect URI and limited the Scope to Read only.

I put the client_id and client_secret values into a json file that looks like this:

{
"client_id":"ryyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy",
"client_secret":"Vzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz"
}

Here is the code to connect:

!pip install boxsdk
!pip install auth
!pip install redis
!pip install mysql.connector
!pip install requests

from boxsdk import OAuth2

import json
#Set the file we want to use for authenticating a Box app
#The json file stores the client_id and client_secret so we don't have it in the code.
# The json file looks like this:
#{
#"client_id":"___the_codes_for_client_id___",
#"client_secret":"___the_codes_for_client_secret___"
#}

oauth_settings_file = 'C:\\ProgramData\\box_app_test.json'
with open(oauth_settings_file, "r") as read_file:
    oauth_data = json.load(read_file)
print(oauth_data["client_id"])
print(oauth_data["client_secret"])

oauth = OAuth2(
    client_id=oauth_data["client_id"],
    client_secret=oauth_data["client_secret"]
)

auth_url, csrf_token = oauth.get_authorization_url('https://jeffblackadar.ca')
print("Click on this:")
print(auth_url)
print(csrf_token)
print("Copy the code that follows code= in the URL.  Paste it into the oauth.authenticate('___the_code___') below.  Be quick, the code lasts only a few seconds.")

I ran the code above in a Jupyter notebook. The output is:

ryyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
 Vzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
 Click on this:
 https://account.box.com/api/oauth2/authorize?state=box_csrf_token_Qcccccccccccccccccccccc&response_type=code&client_id=ryyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy&redirect_uri=https%3A%2F%2Fjeffblackadar.ca
 box_csrf_token_Qcccccccccccccccccccccc
 Copy the code that follows code= in the URL.  Paste it into the oauth.authenticate('the_code') below.  Be quick, the code lasts only a few seconds.

When the URL above is clicked, Box.com first asks you to authenticate with your password, to make sure only authorized users can read your content. After you grant access, you are redirected to the Redirect URI set above, with a code in the URL.

Log in with the user ID that has access to your content.
Click Grant access to Box
Copy the code (but just the code) and paste it into the Python program below.

Paste the code into the oauth.authenticate() statement below. You need to work quickly; the code is valid for only a few seconds. There is a better way to do this, but this is what is working at the moment; please let me know of improvements.

from boxsdk import Client

# Make sure that the csrf token you get from the `state` parameter
# in the final redirect URI is the same token you get from the
# get_authorization_url method to protect against CSRF vulnerabilities.
#assert 'THE_CSRF_TOKEN_YOU_GOT' == csrf_token
access_token, refresh_token = oauth.authenticate('qzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz')
client = Client(oauth)

Then run this test. It will list all of the files in the folders on box.com.

def process_subfolder_test(client, folder_id, folder_name):
    print("this folder: "+folder_name)
    items = client.folder(folder_id=folder_id).get_items()
    for item in items:
        print('{0} {1} is named "{2}"'.format(item.type.capitalize(), item.id, item.name))
        if(item.type.capitalize()=="Folder"):
            process_subfolder_test(client, item.id,folder_name+"/"+item.name)
        if(item.type.capitalize()=="File"):
            #print(item)
            print('File: {0} is named: "{1}" path: {2} '.format(item.id, item.name, folder_name+"/"+item.name))            
    return

process_subfolder_test(client, '0',"")

Here is the test output:

this folder: 
Folder 98208868103 is named "lop"
this folder: /lop
Folder 98436941432 is named "1963"
this folder: /lop/1963
File 588118649408 is named "Elizabeth II young 2019-08-10 15_41_20.591925.jpg"
File: 588118649408 is named: "Elizabeth II young 2019-08-10 15_41_20.591925.jpg" path: /lop/1963/Elizabeth II young 2019-08-10 15_41_20.591925.jpg 
File 588114839194 is named "Elizabeth II young 2019-08-10 15_41_52.188758.jpg"
File: 588114839194 is named: "Elizabeth II young 2019-08-10 15_41_52.188758.jpg" path: /lop/1963/Elizabeth II young 2019-08-10 15_41_52.188758.jpg 
File 587019307270 is named "eII2900.png"
File: 587019307270 is named: "eII2900.png" path: /lop/eII2900.png 
File 587019495720 is named "eII2901.png"
File: 587019495720 is named: "eII2901.png" path: /lop/eII2901.png 
File 587019193229 is named "eII2903.png"
File: 587019193229 is named: "eII2903.png" path: /lop/eII2903.png 
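
Once a file id is known from a listing like this, the file itself can be downloaded with the same authenticated client. A short sketch (the file id and local file name below are just examples taken from the listing above):

# Download one of the listed files by its id
file_id = '587019307270'
with open('eII2900.png', 'wb') as output_file:
    client.file(file_id).download_to(output_file)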

Performance of model parameters

I processed 1458 models in this spreadsheet (see models tab). As mentioned in my previous post, these are the parameters:

  • model_number – the identification number
  • batch_size – the size of the batch (8 or 16)
  • filters1 – the number of filters for layer 1 (possible values 32, 64 or 128), used in model.add(Conv2D(filters=filters1, ...))
  • dropout1 – dropout for layer 1, added only if it is greater than 0 (possible values 0, 0.25, 0.5):

        if(dropout1>0):
            model.add(Dropout(dropout1))

  • filters2 – the number of filters for layer 2 (32, 64 or 128)
  • dropout2 – dropout for layer 2 (0, 0.25, 0.5)
  • filters3 – the number of filters for layer 3 (32, 64 or 128)
  • dropout3 – dropout for layer 3 (0, 0.25, 0.5)
  • loss – the loss from evaluating the model
  • accuracy – the accuracy from evaluating the model

A review of the spreadsheet shows that many of the models I ran have poor accuracy, some no better than chance (1 in 3, or 0.333) at predicting a match between three coin obverse portraits (Elizabeth II, George VI and Abraham Lincoln). I did find some models with an accuracy above 80%, but I wanted to see if there were patterns I could use to improve my set of models. So I used a Seaborn heatmap of the models (below) for batch sizes of 8, 16 and both together.

Heatmap of models 2 – 730 (batch size = 8). There is a slightly negative relationship between accuracy and dropout1. It is possible it would be more efficient to use dropout values of 0 or 0.25 and not 0.5.
Heatmap of models 731 – 1459 (batch size = 16).
Heatmap of models 2 – 1459 (batch sizes of 8,16).

The last two rows of the heatmap, which relate loss and accuracy to the model parameters, show a slightly negative relationship between accuracy and dropout1. It may be more efficient to use dropout values of 0 or 0.25 rather than 0.5 when running these models again. There also seems to be a slightly positive relationship between batch size and accuracy, possibly indicating that larger batch sizes lead to more accurate models. I have been running a set of models with a batch size of 32 to see if this pattern becomes stronger (same spreadsheet, models tab). I am also going to validate my approach through additional personal (not machine) learning.
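
As an illustration of how such a heatmap can be produced (this is a sketch only; it assumes the model results have been exported from the spreadsheet to a CSV file with the column names listed above):

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the exported results and keep the models with a batch size of 8
results_df = pd.read_csv("model_results.csv")
batch8_df = results_df[results_df["batch_size"] == 8]

# Correlation heatmap relating the parameters to loss and accuracy
plt.figure(figsize=(8, 6))
sns.heatmap(batch8_df.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation of model parameters with loss and accuracy (batch size 8)")
plt.show()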

Regularizing the image recognition model.

In Deep Learning with Python, François Chollet provides a “universal workflow of machine learning” (Chapter 4, page 114). I have been using his steps to seek the best-performing image recognition model. I tried iterations of various models with different numbers of layers, filters and dropouts. An example of a model that did not provide a satisfactory level of accuracy is below.

def createModel5fail2():

    #tried kernel_size=(5,5), 

    from keras import models
    from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
    model = models.Sequential()

    model.add(Conv2D(filters=32, 
               kernel_size=(5,5), 
               strides=(1,1),
               padding='same',
               input_shape=(image_width, image_height,NB_CHANNELS),
               data_format='channels_last'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2),
                     strides=2))

    model.add(Conv2D(filters=64,
               kernel_size=(5,5),
               strides=(1,1),
               padding='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2),
                     strides=2))
    
    model.add(Flatten())        
    model.add(Dense(128))
    model.add(Activation('relu'))

    model.add(Dropout(0.25))
    
    #number of classes
    # 1,0,0 E-II
    # 0,1,0 G-VI
    # 0,0,1 Lincoln
    model.add(Dense(3, activation='softmax'))

    return model 

In order to be more methodical and to record my results, I added a spreadsheet of model parameters (see models tab). These are the parameters:

  • model_number – the identification number
  • batch_size – the size of the batch (8 or 16)
  • filters1 – the number of filters for layer 1, used in model.add(Conv2D(filters=filters1, ...))
  • dropout1 – dropout for layer 1, added only if it is greater than 0:

        if(dropout1>0):
            model.add(Dropout(dropout1))

  • filters2 – the number of filters for layer 2
  • dropout2 – dropout for layer 2
  • filters3 – the number of filters for layer 3
  • dropout3 – dropout for layer 3
  • loss – the loss from evaluating the model
  • accuracy – the accuracy from evaluating the model

The code to create the spreadsheet of parameters is here. (It's just nested loops; a sketch of the idea appears after the model code below.) Below is the code to create a model from the parameters fed from the spreadsheet. In the course of writing up this post, I found 2 bugs in the code below that are now corrected. Because of the bugs I need to re-run my results.

def createModelFromSpreadsheet():

    from keras import models
    from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
    model = models.Sequential()
    

    model.add(Conv2D(filters=filters1, 
               kernel_size=(2,2), 
               strides=(1,1),
               padding='same',
               input_shape=(image_width, image_height,NB_CHANNELS),
               data_format='channels_last'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2),
                     strides=2))
    if(dropout1>0):
        model.add(Dropout(dropout1))
    
    model.add(Conv2D(filters=filters2,
               kernel_size=(2,2),
               strides=(1,1),
               padding='valid'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2),
                     strides=2))

    if(dropout2>0):
        model.add(Dropout(dropout2))

    if(filters3>0): 
        model.add(Conv2D(filters=filters3,
               kernel_size=(2,2),
               strides=(1,1),
               padding='valid'))
        model.add(Activation('relu'))
        model.add(MaxPooling2D(pool_size=(2,2),
                     strides=2))

        if(dropout3>0):
            model.add(Dropout(dropout3))

    
    model.add(Flatten())        
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))
    
    #number of classes
    # 1,0,0 E-II
    # 0,1,0 G-VI
    # 0,0,1 Lincoln
    model.add(Dense(3, activation='softmax'))

    return model
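
As mentioned above, the spreadsheet of parameters was generated with nested loops. A minimal sketch of that idea (using itertools and a CSV file rather than the actual Google Sheets code, and assuming the candidate values listed in the previous post):

import csv
from itertools import product

# Candidate values for each parameter
batch_sizes = [8, 16]
filter_counts = [32, 64, 128]
dropout_values = [0, 0.25, 0.5]

with open("model_parameters.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["model_number", "batch_size", "filters1", "dropout1",
                     "filters2", "dropout2", "filters3", "dropout3"])
    model_number = 1
    for combo in product(batch_sizes, filter_counts, dropout_values,
                         filter_counts, dropout_values, filter_counts, dropout_values):
        writer.writerow([model_number] + list(combo))
        model_number += 1
# 2 * 3 * 3 * 3 * 3 * 3 * 3 = 1458 combinations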
  

Below is the code to loop through each row of the model spreadsheet, create a model from the parameters, fit it and record the result.

number_of_models_to_run = 40
for number_of_models_to_run_count in range (0,number_of_models_to_run):
    model_row = int(worksheet_config.cell(1, 2).value)

    BATCH_SIZE = int(worksheet_models.cell(model_row, 2).value,0) #, 'batch_size')
    filters1 = int(worksheet_models.cell(model_row, 3).value,0) #, 'filters1')
    dropout1 = float(worksheet_models.cell(model_row, 4).value) #, 'dropout1')
    filters2 = int(worksheet_models.cell(model_row, 5).value,0) #, 'filters2')
    dropout2 = float(worksheet_models.cell(model_row, 6).value) #, 'dropout2')
    filters3 = int(worksheet_models.cell(model_row, 7).value,0) #, 'filters3')
    dropout3 = float(worksheet_models.cell(model_row, 8).value) #, 'dropout3')

    print(str(model_row)+" "+str(BATCH_SIZE)+" "+str(filters1)+" "+str(dropout1)+" "+str(filters2)+" "+str(dropout2)+" "+str(filters3)+" "+str(dropout3))
    # NB_CHANNELS: 3 for RGB images or 1 for grayscale images
    NB_CHANNELS =  1
    # NB_TRAIN_IMG / NB_VALID_IMG: the total numbers of training and validation images
    NB_TRAIN_IMG = 111
    NB_VALID_IMG = 54


    #*************
    #* Change model
    #*************
    model2 = createModelFromSpreadsheet()
    model2.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    model2.summary()

    epochs = 100

    # Fit the model on the batches generated by datagen.flow().
    history2 = model2.fit_generator(datagen.flow(tr_img_data , tr_lbl_data, batch_size=BATCH_SIZE),
                                  #steps_per_epoch=int(np.ceil(tr_img_data .shape[0] / float(batch_size))),
                                  steps_per_epoch=NB_TRAIN_IMG//BATCH_SIZE,
                                  epochs=epochs,
                                  validation_data=(val_img_data, val_lbl_data),
                                  validation_steps=NB_VALID_IMG//BATCH_SIZE,
                                  shuffle=True,
                                  workers=4)

    evaluation = model2.evaluate(tst_img_data, tst_lbl_data)
    print(evaluation)
    print(evaluation[0])
    #record results
    worksheet_models.update_cell(model_row, 10, evaluation[0])
    worksheet_models.update_cell(model_row, 11, evaluation[1])
    worksheet_config.update_cell(1, 2, str(model_row+1))
    if(evaluation[1]>0.75):
        print("Good Model - stopped")
        break

In the course of running these models, I had one that provided 77% image recognition accuracy when tested, so I saved the weights. Due to the bugs I found, I am re-running my results now to see if I can reproduce that model and find a better one.
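
Saving and restoring the weights is a one-line call in Keras; a small sketch (the file name here is a placeholder):

# Save the trained weights so the 77% model can be reloaded later
model2.save_weights('coin_model_77_weights.h5')

# To reuse them, rebuild the same architecture and load the weights back in:
# model2.load_weights('coin_model_77_weights.h5')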

Image Classification – Tuning models

Since the start of September I have been working to improve my image classification model. The positive result is that I have a model that is capable of categorizing 3 different types of coins; however, the model is not yet as accurate as it needs to be. For reference, here is my working code.

Categorizing three different types of coin images.

I have added photos of Abraham Lincoln to the collection of coin photos I am using for training. Each class of photo is one-hot encoded to give it an identifier that can be used in the model: 1,0,0 = Elizabeth II; 0,1,0 = George VI; and 0,0,1 = Abraham Lincoln. (Continuing this pattern, additional classes of coins can be added for training.) Below is the code that does this based on the first three characters of the photo's file name.

import numpy as np

def one_hot_label(img):
    # The first three characters of the file name identify the coin class
    label = img.split('.')[0]
    label = label[:3]
    if label == 'eII':
        ohl = np.array([1,0,0])
    elif label == 'gvi':
        ohl = np.array([0,1,0])
    elif label == 'lin':
        ohl = np.array([0,0,1])
    return ohl
(above) An example of an image of Abraham Lincoln used in training the model. This image has a label of 0,0,1 to indicate that it belongs to the same class as other images of Lincoln. (I am a little concerned that the numbers of the year and letters from “Liberty” will interfere with the training.)

The model I have trained can recognize Abraham Lincoln more often than not.

predict_for('/content/drive/My Drive/coin-image-processor/portraits/test/all/linc4351.png')
produced a result of [0. 0. 1.], which is correct. The model fails to accurately predict some of the other images of Lincoln.
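
predict_for is defined in the full working code linked above; as a rough sketch of what such a helper might do (the 100-pixel input size, the scaling and the model2 name are assumptions here, not the original code):

import cv2
import numpy as np

def predict_for_sketch(image_path, img_size=100):
    # Load the coin portrait, preprocess it the same way as the training data
    # (grayscale, resized, scaled), then ask the trained model for a prediction
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    resized = cv2.resize(image, (img_size, img_size))
    batch = resized.reshape(1, img_size, img_size, 1) / 255.0
    prediction = model2.predict(batch)
    # Round to a one-hot style result such as [0. 0. 1.]
    return np.round(prediction[0])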

Model Accuracy

When training the model I monitor the loss and accuracy for both training and validation. Validation accuracy measures the model's effectiveness against a set of validation images it does not train on. Training accuracy measures how well the model performs on its training data. A model is functioning well if its training accuracy and validation accuracy are both high.

 Epoch 16/150 13/13 [==============================] - 0s 23ms/step - loss: 0.8050 - acc: 0.5769 - val_loss: 10.7454 - val_acc: 0.3333 

As shown above, at this point in the training of this model, the training accuracy (acc:) is low (57.7%) and the validation accuracy (val_acc:) is even lower (33%). For a prediction among 3 different types of coins, this model is validated to be no better than guessing at random.

A graph of the accuracy of a model over 150 epochs of training.

The red line of the training accuracy in the graph above shows a model that becomes more accurate over time. The accuracy of the model is very low initially, but it does climb almost continuously.

The validation accuracy of the model also begins quite low. Consider the area of the graph inside the magenta box denoted by (T). During this training, val_acc stalls at 33% between epochs 5 and 25. During my experiments with different model configurations, if I saw this stall happen I would terminate the training to save time. Considering what happened here, I should let the models run longer. This model eventually achieved a validation accuracy of 78%, the best result I had in the past couple of days.
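
The accuracy graph above can be produced from the history object that Keras returns from training; a minimal sketch (assuming a history2 object like the one returned by fit_generator in the previous post):

import matplotlib.pyplot as plt

# history2 is the object returned by fit_generator()/fit() during training
plt.plot(history2.history['acc'], color='red', label='training accuracy')
plt.plot(history2.history['val_acc'], color='blue', label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()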

Overfitting

The validation accuracy of this model peaks at epoch 88. As it declines, the training accuracy continues to trend higher. This is a sign that the model is overfitting: it is training on features that are present in the training data but won't generally be present in other images. An overfit model is not useful for recognizing images from outside its training set. This information is useful since it indicates that this model should be trained for approximately 88 epochs, not 150. At the same time, this particular model still needs work. Even with a validation accuracy of 77%, the model is likely still overfit given it has a training accuracy of 90%. So it is likely that this model will make errors of prediction when used with new images of our coin subjects.